I\'m doing Sentdex\'s PyQt4 YouTube tutorial right here. I\'m trying to follow along but use PyQt5 instead. It\'s a simple web scraping app. I followed along with Sentdex\'s
you must call the QWebEnginePage::toHtml()
inside the definition of the class. QWebEnginePage::toHtml()
takes a pointer function or a lambda as a parameter, and this pointer function must in turn take a parameter of 'str' type (this is the parameter that contains the page's html). Here is sample code below.
import bs4 as bs
import sys
import urllib.request
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl
class Page(QWebEnginePage):
def __init__(self, url):
self.app = QApplication(sys.argv)
QWebEnginePage.__init__(self)
self.html = ''
self.loadFinished.connect(self._on_load_finished)
self.load(QUrl(url))
self.app.exec_()
def _on_load_finished(self):
self.html = self.toHtml(self.Callable)
print('Load finished')
def Callable(self, html_str):
self.html = html_str
self.app.quit()
def main():
page = Page('https://pythonprogramming.net/parsememcparseface/')
soup = bs.BeautifulSoup(page.html, 'html.parser')
js_test = soup.find('p', class_='jstest')
print js_test.text
if __name__ == '__main__': main()
Never too late... I got the same issue and found description of it here: http://pyqt.sourceforge.net/Docs/PyQt5/gotchas.html#crashes-on-exit
I followed the advice of puting the QApplication in a global variable (I know it is dirty... and I will be punished for that) and it works "fine". I can loop without any crash.
Hope this will help.