Behavior of the scrapy xpath selector on h1-h6 tags

逝去的感伤 2021-01-16 19:59

Why do the following two code snippets give different outputs? The only difference between them is that the h1 tag in the first case is replaced with an

2 answers
  •  余生分开走
    2021-01-16 20:47

    Nesting p tags inside heading tags (h1-h6) is invalid HTML according to the W3C. You can see more about this here

    To work around this and handle any XML structure, change the selector type like this:

    from scrapy.selector import Selector
    sel = Selector(text="anyxml", type="xml")  # "anyxml" stands for your markup string
    

    With type="xml", the selector keeps the XML structure as written instead of applying HTML parsing rules.
