Python Memory Issue with BeautifulSoup

旧巷少年郎 2021-01-25 17:35

I've resolved this issue, but I'm wondering why it was caused in the first place. I used BeautifulSoup to identify this span from a webpage:

span = 

        
2 Answers
  • 2021-01-25 18:12

    Probably because str(span.contents) calls the __str__ method of span.contents and returns a smaller representation. You can use pympler to measure the memory consumption.
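
A minimal sketch of how such a measurement might look. It assumes pympler is installed and falls back to sys.getsizeof (which counts only the object itself) when it is not. The Node class and the deep_size helper are hypothetical stand-ins made up for illustration; they are not BeautifulSoup classes.

```python
import sys

try:
    from pympler import asizeof  # third-party: pip install pympler
except ImportError:
    asizeof = None


def deep_size(obj):
    """Return the size in bytes of obj plus everything reachable from it.

    Hypothetical helper: without pympler it falls back to sys.getsizeof,
    which reports the object alone and ignores what it references.
    """
    if asizeof is not None:
        return asizeof.asizeof(obj)
    return sys.getsizeof(obj)


class Node:
    """Toy stand-in for a parsed element that links back to its parent."""

    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


# Build a small "document": one root with many children.
root = Node("root")
for i in range(1000):
    Node(f"filler text {i}", parent=root)
leaf = Node("Some Restaurant", parent=root)

linked_size = deep_size(leaf)         # follows leaf.parent into the whole tree
detached_size = deep_size(leaf.text)  # a single plain string

print("leaf with parent link:", linked_size, "bytes")
print("plain string only:    ", detached_size, "bytes")
```

With pympler available, the first number should cover the entire toy tree, because the measurement follows the parent link, while the second covers one short string.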

  • 2021-01-25 18:24

    Old stuff, but just in case other people wonder: span.contents gives you a reference to a NavigableString instance rather than a plain string. This instance is linked to the DOM tree, so as long as the instance is in use, the garbage collector cannot release the DOM tree. Thus, as long as restaurant.name is not released from memory, the whole DOM tree is kept in memory.

    Using str(span.contents) returns a string which is not linked with the DOM tree, so it does not prevent the DOM tree from being released from memory.
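
The effect described above can be reproduced with the standard library alone. This is a toy model, not BeautifulSoup itself: Tree and LinkedText are hypothetical classes, with LinkedText being a str subclass that keeps a parent reference, loosely the way NavigableString does. A weak reference reveals whether the tree is still alive after garbage collection.

```python
import gc
import weakref


class Tree:
    """Toy stand-in for a parsed document (not a BeautifulSoup class)."""

    def __init__(self):
        self.children = []


class LinkedText(str):
    """A str subclass that remembers the tree it belongs to."""

    parent = None


def parse():
    """Build a tiny tree holding one text node that links back to it."""
    tree = Tree()
    text = LinkedText("Some Restaurant")
    text.parent = tree
    tree.children.append(text)
    return tree, text


# Case 1: keep the linked text object itself.
tree, kept = parse()
tree_ref = weakref.ref(tree)
del tree
gc.collect()
linked_alive = tree_ref() is not None  # tree is still reachable via kept.parent

# Case 2: keep only a plain str copy of the text.
tree, text = parse()
tree_ref2 = weakref.ref(tree)
plain = str(text)  # exact str, with no parent attribute
del tree, text
gc.collect()
plain_alive = tree_ref2() is not None  # nothing references the tree any more

print(type(plain) is str)  # True
print(linked_alive)        # True
print(plain_alive)         # False
```

In the first case the single surviving text object pins the whole tree; in the second, the plain copy carries no link, so the collector can free the tree.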
