findall

Word boundary with regex - cannot extract all words

丶灬走出姿态 posted on 2019-12-18 09:12:18
Question: I need to extract Male-Cat twice: a = "Male-Cat Male-Cat Male-Cat-Female" b = re.findall(r'(?:\s|^)Male-Cat(?:\s|$)', a) print (b) ['Male-Cat '] c = re.findall(r'\bMale-Cat\b', a) print (c) ['Male-Cat', 'Male-Cat', 'Male-Cat'] I need to extract Male-Cat three times: a = "Male-Cat Male-Cat Male-Cat" b = re.findall(r'(?:\s|^)Male-Cat(?:\s|$)', a) print (b) ['Male-Cat ', ' Male-Cat'] c = re.findall(r'\bMale-Cat\b', a) print (c) ['Male-Cat', 'Male-Cat', 'Male-Cat'] Other strings which are parsed
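One possible way to match whole whitespace-delimited tokens without consuming the separators is to swap the `(?:\s|^)` and `(?:\s|$)` groups for lookarounds. This is a minimal sketch, assuming tokens are separated by whitespace; the names `TOKEN`, `two_expected` and `three_expected` are illustrative and do not come from the question.

```python
import re

# (?<!\S) means "not preceded by a non-space character" (start of string or whitespace).
# (?!\S) means "not followed by a non-space character" (end of string or whitespace).
# Lookarounds do not consume the space, so adjacent tokens can both match.
TOKEN = re.compile(r"(?<!\S)Male-Cat(?!\S)")

two_expected = TOKEN.findall("Male-Cat Male-Cat Male-Cat-Female")
three_expected = TOKEN.findall("Male-Cat Male-Cat Male-Cat")

print(two_expected)    # ['Male-Cat', 'Male-Cat']
print(three_expected)  # ['Male-Cat', 'Male-Cat', 'Male-Cat']
```

The original pattern misses the middle token because the trailing `\s` of one match swallows the space that the next match would need as its leading `\s`.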

Regex backreference findall not working

断了今生、忘了曾经 posted on 2019-12-17 20:39:01
Question: I have recently been using regexes in a program. In this program I used them to find words in a list of words that matched a certain RE. However, when I tried backreferencing with this program, I got an interesting result. Here is the code: import re pattern = re.compile(r"[abcgr]([a-z])\1[ldc]") string = "reel reed have that with this they" print(re.findall(pattern, string)) What I expected was the result ["reel","reed"] (the regex matched these when I used it with Pythex) However, when I
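The excerpt is cut off before the actual output, so the following assumes the surprise is the documented behavior of `re.findall`: when a pattern contains one capturing group, it returns the group contents instead of the full match. The sketch below reuses the pattern from the question and shows one way to get the whole words with `finditer`; the variable names are illustrative.

```python
import re

pattern = re.compile(r"[abcgr]([a-z])\1[ldc]")
text = "reel reed have that with this they"

# With one capturing group, findall returns only what the group captured.
groups_only = pattern.findall(text)

# finditer yields match objects; group(0) is always the entire match.
whole_matches = [m.group(0) for m in pattern.finditer(text)]

print(groups_only)    # ['e', 'e']
print(whole_matches)  # ['reel', 'reed']
```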

Python ElementTree module: How to ignore the namespace of XML files to locate matching element when using the method “find”, “findall”

﹥>﹥吖頭↗ posted on 2019-12-16 22:10:31
Question: I want to use the "findall" method of the ElementTree module to locate some elements in the source XML file. However, the source XML file (test.xml) has a namespace. Here is a truncated part of the XML file as a sample: <?xml version="1.0" encoding="iso-8859-1"?> <XML_HEADER xmlns="http://www.test.com"> <TYPE>Updates</TYPE> <DATE>9/26/2012 10:30:34 AM</DATE> <COPYRIGHT_NOTICE>All Rights Reserved.</COPYRIGHT_NOTICE> <LICENSE>newlicense.htm</LICENSE> <DEAL_LEVEL> <PAID_OFF>N</PAID_OFF> </DEAL_LEVEL> </XML
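Two common approaches can be sketched with the standard library: pass a prefix-to-URI mapping to `findall`, or strip the namespace from every tag after parsing. The XML string below is a shortened, hand-completed stand-in for the truncated sample (the closing tag and the omitted elements are assumptions), and the prefix `t` is arbitrary.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for test.xml; only the namespace URI and a few tags come from the question.
XML = """<XML_HEADER xmlns="http://www.test.com">
  <TYPE>Updates</TYPE>
  <DEAL_LEVEL><PAID_OFF>N</PAID_OFF></DEAL_LEVEL>
</XML_HEADER>"""

root = ET.fromstring(XML)

# A bare tag name finds nothing: the real tag is "{http://www.test.com}DEAL_LEVEL".
unqualified = root.findall("DEAL_LEVEL")

# Option 1: map an arbitrary prefix to the namespace URI.
ns = {"t": "http://www.test.com"}
qualified = root.findall("t:DEAL_LEVEL", ns)
paid_off = root.find("t:DEAL_LEVEL/t:PAID_OFF", ns).text

# Option 2: strip the "{uri}" part from every tag, then search by plain names.
for element in root.iter():
    element.tag = element.tag.split("}", 1)[-1]
stripped = root.findall("DEAL_LEVEL")

print(len(unqualified), len(qualified), paid_off, len(stripped))
```

Option 2 modifies the tree in place, so it is best suited to read-only processing where the namespace information is not needed again.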

Python regex returns a part of the match when used with re.findall

邮差的信 posted on 2019-12-14 03:45:57
Question: I have been trying to teach myself Python and am currently on regular expressions. The instructional text I have been using seems to be aimed at teaching Perl or some other language that is not Python, so I have had to adapt the expressions a bit to fit Python. I'm not very experienced, however, and I've hit a snag trying to get an expression to work. The problem involves searching a text for instances of prices, expressed either without decimals, $500, or with decimals, $500.10. This is what
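The asker's pattern is not visible in the excerpt, so this is only a plausible reconstruction: an optional decimal part written as a capturing group makes `re.findall` return just that group. A non-capturing group `(?:...)` is one way to keep the full price. Both patterns and the sample text are assumptions made for illustration.

```python
import re

text = "Items cost $500 or $500.10."

# Capturing group: findall returns only the (possibly empty) decimal part.
with_capture = re.findall(r"\$\d+(\.\d\d)?", text)

# Non-capturing group: findall returns the whole match.
without_capture = re.findall(r"\$\d+(?:\.\d\d)?", text)

print(with_capture)     # ['', '.10']
print(without_capture)  # ['$500', '$500.10']
```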

JSON data recognized as a string field instead of integer

老子叫甜甜 posted on 2019-12-13 21:33:41
Question: I use the Python code below to manipulate a set of JSON files in a specified folder. I extract nt from the data and I want to create a new key-value pair. If I print nt on my screen, I get values as shown below. nt 223 nt 286 nt 315 These look like integers to me. However, when I process the data with the Kibana visualization tool, it reports NXT as an analysed string field. I want these values to be recognized as integers. Does this have something to do with the way I am encoding my json
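The code itself is missing from the excerpt, so the cause can only be guessed. One common possibility, assumed here, is that the value was extracted as text (for example with `re.findall`, which returns strings) and then serialized without conversion. The sample input and the way `nt` is extracted are hypothetical; only the key name `NXT` comes from the question.

```python
import json
import re

# Hypothetical extraction step: regex results are always strings.
nt = re.findall(r"\d+", "nt 223")[0]

as_string = json.dumps({"NXT": nt})        # value is quoted in the JSON output
as_integer = json.dumps({"NXT": int(nt)})  # value is a JSON number

print(as_string)   # {"NXT": "223"}
print(as_integer)  # {"NXT": 223}
```

Printing `nt` shows `223` in both cases, which is why the values can look like integers on screen while still being serialized as strings.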

findAll on non object in extbase

大兔子大兔子 posted on 2019-12-13 07:40:39
Question: I just created an extension in TYPO3 4.5 with one model (product). I created the "productRepository" and then injected it into the ProductController, but I still get the error "Call to a member function findAll() on a non-object". Here is how the ProductController looks: /** * @var Tx_PiProductDetail_Domain_Repository_ProductRepository */ protected $productRepository; /** * @param Tx_PiProductDetail_Domain_Repository_ProductRepository $productRepository * @return void */ public function

re.search not returning strings, but re.findall does

走远了吗. posted on 2019-12-13 06:26:41
Question: I'm getting a bit confused here as to why this is happening. Here's the short and simple code: with open("file.xml") as xmlFile: # reading the xmlFile xmlLines=list() for line in xmlFile: newLine=xmlSearch.findall(line) print newLine RETURNS: (I changed the actual output for security reasons) [] [] [] [] [] ['TEXT_IN_STRING_FORMAT-SENSITIVE_DATA'] ['TEXT_IN_STRING_FORMAT-SENSITIVE_DATA'] ['TEXT_IN_STRING_FORMAT-SENSITIVE_DATA'] ['TEXT_IN_STRING_FORMAT-SENSITIVE_DATA'] ['TEXT_IN_STRING_FORMAT
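The difference named in the title follows from the return types: `findall` returns a list of strings, while `search` returns a match object (or `None`), from which the text has to be pulled with `.group()`. The pattern and sample lines below are hypothetical stand-ins, since the question's `xmlSearch` and file contents are not shown.

```python
import re

# Hypothetical pattern and input; the question's real ones are not visible.
xml_search = re.compile(r"<name>(.*?)</name>")
lines = ["<root>", "<name>alpha</name>", "</root>"]

# findall: one list of strings per line, empty when nothing matches.
from_findall = [xml_search.findall(line) for line in lines]

# search: a match object or None, so test it before calling group().
from_search = []
for line in lines:
    match = xml_search.search(line)
    if match is not None:
        from_search.append(match.group(1))

print(from_findall)  # [[], ['alpha'], []]
print(from_search)   # ['alpha']
```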

Beautiful Soup findAll doesn't find them all

喜你入骨 posted on 2019-12-12 16:36:05
Question: I'm trying to parse a website and get some info with BeautifulSoup.findAll, but it doesn't find them all. I'm using Python 3; the code is this: #!/usr/bin/python3 from bs4 import BeautifulSoup from urllib.request import urlopen page = urlopen ("http://mangafox.me/directory/") # print (page.read ()) soup = BeautifulSoup (page.read ()) manga_img = soup.findAll ('a', {'class' : 'manga_img'}, limit=None) for manga in manga_img: print (manga['href']) It only prints half of them... Answer 1: Different

Java: What is the best way to find elements in a sorted List?

橙三吉。 posted on 2019-12-12 12:00:33
Question: I have a List<Cat> sorted by the cats' birthdays. Is there an efficient Java Collections way of finding all the cats that were born on January 24th, 1983? Or, what is a good approach in general? Answer 1: Collections.binarySearch(). Assuming the cats are sorted by birthday, this will give the index of one of the cats with the correct birthday. From there, you can iterate backwards and forwards until you hit one with a different birthday. If the list is long and/or not many cats share a birthday,
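The general approach can be illustrated in Python (not Java, and not the answer's exact method): instead of one binary search followed by a linear walk in both directions, this variant runs two binary searches with the standard `bisect` module to find both ends of the run of equal birthdays. The cat names and dates are made-up sample data.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Made-up sample data, already sorted by birthday.
cats = [
    ("Tom", date(1982, 5, 1)),
    ("Felix", date(1983, 1, 24)),
    ("Misty", date(1983, 1, 24)),
    ("Salem", date(1990, 7, 7)),
]

# Separate key list so bisect compares dates only.
birthdays = [birthday for _, birthday in cats]
target = date(1983, 1, 24)

start = bisect_left(birthdays, target)   # first index with birthday >= target
end = bisect_right(birthdays, target)    # first index with birthday > target
matches = cats[start:end]

print([name for name, _ in matches])  # ['Felix', 'Misty']
```

If no cat has the target birthday, `start` equals `end` and the slice is empty.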

beautifulsoup4, correct way to use .find_all?

守給你的承諾、 posted on 2019-12-12 04:39:09
Question: If I parse a website using BS4, and from its source code I want to print the text "+26.67%" <font color="green"><b><nobr>+26.67%</nobr></b></font> I have been messing around with the .find_all() command (http://www.crummy.com/software/BeautifulSoup/bs4/doc/) to no avail. What would be the correct way to search the source code and print just the text? My code: import requests from bs4 import BeautifulSoup set_url = "*insert web address here*" set_response = requests.get(set_url) set_data = set