BeautifulSoup does not parse XML with an encoding other than UTF-8

花落未央 2021-01-19 18:14

I can read all XML files that start with but I cannot read the files that start with

2 answers
  •  星月不相逢
    2021-01-19 18:37

    I have the exact same problem. My workaround is to not read the xml declaration:

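    from bs4 import BeautifulSoup as Soup

    with open('tests/xml-iso.xml', 'r', encoding='iso-8859-1') as f_in:
        f_in.readline()  # skip the XML declaration (assumes it sits alone on the first line)
        # f_in.read() returns an already-decoded str, so from_encoding would be ignored; omit it
        xml_soup = Soup(f_in.read(), 'xml')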
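
Skipping the first line only works when the XML declaration occupies that whole line. If the root element follows the declaration on the same line, `readline()` would discard it too. A minimal sketch of a more tolerant variant, assuming the file's encoding is known in advance: the helper names `strip_xml_declaration` and `read_xml_without_declaration` are hypothetical, not part of BeautifulSoup, and use only the standard library.

```python
import re

# Matches an XML declaration at the very start of the text, for example
# <?xml version="1.0" encoding="ISO-8859-1"?>, plus any whitespace after it.
# Requiring whitespace after "xml" avoids matching <?xml-stylesheet ...?>.
_XML_DECL = re.compile(r'\A\ufeff?\s*<\?xml\s[^>]*\?>\s*')


def strip_xml_declaration(text: str) -> str:
    """Return text without its leading XML declaration, if it has one."""
    return _XML_DECL.sub('', text, count=1)


def read_xml_without_declaration(path: str, encoding: str) -> str:
    """Decode the file with the given encoding and drop the declaration."""
    with open(path, 'r', encoding=encoding) as f_in:
        return strip_xml_declaration(f_in.read())
```

The returned string can then be handed to the parser in the same way as in the workaround, for instance `Soup(read_xml_without_declaration('tests/xml-iso.xml', 'iso-8859-1'), 'xml')`.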
    
