Can I supply a URL to lxml.etree.parse on Python 3?

轻奢々 2021-01-05 05:59

The documentation says I can:

"lxml can parse from a local file, an HTTP URL or an FTP URL. It also auto-detects and reads gzip-compressed XML files."

1 answer
  • 2021-01-05 06:48

    The issue is that lxml cannot fetch HTTPS URLs: the libxml2 network client it relies on only handles HTTP and FTP. http://pypi.python.org/simple redirects to an HTTPS version, so lxml cannot load it.

    So for any secure website, you need to read the URL yourself:

    from lxml import etree
    from urllib.request import urlopen
    
    parser = etree.HTMLParser()
    
    # urlopen handles HTTPS and follows redirects; lxml then parses the open response
    with urlopen('https://pypi.python.org/simple') as f:
        tree = etree.parse(f, parser)
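
If the same code has to accept both local files and remote pages, the approach above could be wrapped in a small helper. This is only a sketch: the names `is_remote` and `parse_html` are made up for illustration and are not part of lxml. It assumes that every remote URL should be fetched with urllib (so that HTTPS and redirects to HTTPS both work), while local paths are handed to lxml directly.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

# Schemes treated as "remote" in this sketch (an assumption, not an lxml rule).
REMOTE_SCHEMES = {"http", "https", "ftp"}


def is_remote(source):
    """Return True if source looks like a URL that urllib should download."""
    return urlsplit(source).scheme.lower() in REMOTE_SCHEMES


def parse_html(source):
    """Parse HTML from a local path or a remote URL into an lxml tree."""
    # Imported here so that is_remote() stays usable without lxml installed.
    from lxml import etree

    parser = etree.HTMLParser()
    if is_remote(source):
        # Let urllib do the network work, then parse the open response.
        with urlopen(source) as response:
            return etree.parse(response, parser)
    # Local files can be passed to lxml as a path.
    return etree.parse(source, parser)
```

With this, `parse_html('https://pypi.python.org/simple')` and `parse_html('page.html')` would go through the same call, and an HTTP URL that redirects to HTTPS is handled by urllib rather than by lxml.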
    