Is there any way to turn find_all into a more memory-efficient generator? For example:
Given:
soup = BeautifulSoup(content, "html.parser")
From the documentation:
I gave the generators PEP 8-compliant names, and transformed them into properties:
childGenerator() -> children
nextGenerator() -> next_elements
nextSiblingGenerator() -> next_siblings
previousGenerator() -> previous_elements
previousSiblingGenerator() -> previous_siblings
recursiveChildGenerator() -> descendants
parentGenerator() -> parents
There is a chapter in the documentation named Generators that covers these; it is worth reading.
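As a rough sketch of how the descendants generator could stand in for find_all, the helper below walks the tree lazily and yields matching tags one at a time. The name iter_find_all and the sample content are hypothetical, made up for illustration. Note that this only avoids building the full result list; the parsed tree itself still lives in memory.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag


def iter_find_all(soup, name):
    """Lazily yield tags called `name` (hypothetical helper, not part of bs4)."""
    for element in soup.descendants:
        # descendants also yields text nodes, so keep only Tag objects
        if isinstance(element, Tag) and element.name == name:
            yield element


# Sample markup, assumed for illustration only
content = "<div><a href='/a'>A</a><p>text <a href='/b'>B</a></p></div>"
soup = BeautifulSoup(content, "html.parser")

links = iter_find_all(soup, "a")
first = next(links)      # only walks the tree until the first match
print(first["href"])     # prints /a
```

Because it is a generator, you can stop early (for example with next() or itertools.islice) without ever collecting every match.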
SoupStrainer parses only part of the HTML, which can save memory, but it only excludes the irrelevant tags. If your HTML has thousands of the tags you want, you will run into the same memory problem.
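For reference, a minimal sketch of the SoupStrainer approach, assuming you only care about a tags. The sample content is made up for illustration; parse_only is the BeautifulSoup constructor argument that accepts a SoupStrainer.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Sample markup, assumed for illustration only
content = "<div><a href='/a'>A</a><p>text <a href='/b'>B</a></p></div>"

# Keep only <a> tags while parsing; everything else is discarded
only_links = SoupStrainer("a")
soup = BeautifulSoup(content, "html.parser", parse_only=only_links)

hrefs = [a["href"] for a in soup.find_all("a")]
print(hrefs)
```

The resulting tree contains just the matching tags, so it is smaller than a full parse, but as noted above it still grows with the number of matches.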