BeautifulSoup `find_all` generator

情书的邮戳 2021-02-01 11:17

Is there any way to turn `find_all` into a more memory-efficient generator? For example:

Given:

soup = BeautifulSoup(content, "html.parser")


        
3 answers
  •  猫巷女王i
    2021-02-01 11:43

    From the documentation:

    I gave the generators PEP 8-compliant names, and transformed them into properties:

    childGenerator() -> children
    nextGenerator() -> next_elements
    nextSiblingGenerator() -> next_siblings
    previousGenerator() -> previous_elements
    previousSiblingGenerator() -> previous_siblings
    recursiveChildGenerator() -> descendants
    parentGenerator() -> parents
    

    There is a section in the documentation named Generators; it is worth reading.
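
A minimal sketch of how the `descendants` generator could stand in for `find_all`. The helper name `iter_find_all` and the sample HTML are made up for illustration; they are not part of BeautifulSoup. Note that the whole tree is still parsed into memory first, so this only avoids building the full result list and lets you stop early.

```python
from itertools import islice

from bs4 import BeautifulSoup, Tag


def iter_find_all(soup, name):
    """Lazily yield tags called `name`, in document order (hypothetical helper)."""
    for node in soup.descendants:
        # descendants also yields text nodes, so keep only Tag objects
        if isinstance(node, Tag) and node.name == name:
            yield node


# Sample document, invented for this example
html = (
    "<div><a href='/1'>one</a>"
    "<p>text <a href='/2'>two</a></p>"
    "<a href='/3'>three</a></div>"
)
soup = BeautifulSoup(html, "html.parser")

# Take only the first two matches; the third <a> is never visited
first_two = [a["href"] for a in islice(iter_find_all(soup, "a"), 2)]
print(first_two)
```

If you only need the first few matches, `find_all` also accepts a `limit` argument, which may be simpler than a custom generator.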

    SoupStrainer parses only part of the HTML, which can save memory. However, it only excludes the irrelevant tags: if your HTML has thousands of the tags you want, you will run into the same memory problem.
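
A small sketch of the SoupStrainer approach, assuming you only care about `<a>` tags. The sample HTML is invented for the example. Everything that does not match the strainer is dropped while parsing, so it never ends up in the tree.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Sample document, invented for this example
html = (
    "<html><body><h1>Title</h1>"
    "<p>intro <a href='/1'>one</a></p>"
    "<a href='/2'>two</a></body></html>"
)

# Keep only <a> tags while parsing; other tags are discarded
only_links = SoupStrainer("a")
soup = BeautifulSoup(html, "html.parser", parse_only=only_links)

hrefs = [a["href"] for a in soup.find_all("a")]
print(hrefs)

# Tags that did not match the strainer are not in the tree at all
print(soup.find("h1"))
```

As far as I know, `parse_only` has no effect with the html5lib parser, so this sketch sticks to `html.parser`.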
