scrapy: convert html string to HtmlResponse object

时光取名叫无心 2021-01-31 18:04

I have a raw HTML string that I want to convert to a Scrapy HtmlResponse object so that I can use the css and xpath selectors, similar to Scrapy's

2 answers

夕颜 (original poster)

2021-01-31 18:44
First of all, if it is for debugging or testing purposes, you can use the Scrapy shell:
```
$ cat index.html
<div id="test">
    Test text
</div>

$ scrapy shell ./index.html
>>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
'Test text'
```
There are different objects available in the shell during the session, like response and request.

Or, you can instantiate an HtmlResponse class and provide the HTML string in body:
```
>>> from scrapy.http import HtmlResponse
>>> response = HtmlResponse(url="my HTML string", body='<div id="test">Test text</div>', encoding='utf-8')
>>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
'Test text'
```