Understanding callbacks in Scrapy


不知归路 2021-01-30 18:43

I am new to Python and Scrapy. I have not used callback functions before. However, I do now for the code below. The first request will be executed and the response of that will

3 answers

失恋的感觉 (original poster)

2021-01-30 19:42
In "Scrapy: understanding how do items and requests work between callbacks", eLRuLL's answer is wonderful.

I want to add the part about how the item is passed along. First, be clear that a callback function is only called once the response for its request has been downloaded.

The code given in the Scrapy docs does not declare the URL or the request for page1. Let's set the URL of page1 to "http://www.example.com.html".

[parse_page1] is the callback of
```
scrapy.Request("http://www.example.com.html", callback=parse_page1)
```
[parse_page2] is the callback of
```
scrapy.Request("http://www.example.com/some_page.html",callback=parse_page2)
```
When the response of page1 is downloaded, parse_page1 is called to generate the request for page2:
```
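# inside parse_page1(self, response); assumes item was created earlier in this method
item['main_url'] = response.url  # store "http://www.example.com.html" in the item
request = scrapy.Request("http://www.example.com/some_page.html",
                         callback=self.parse_page2)
request.meta['item'] = item  # store the item in request.meta
return request  # hand the request back to Scrapy so page2 gets downloaded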
```
After the response of page2 is downloaded, parse_page2 is called to return an item:
```
item = response.meta['item']
# response.meta is the same dict as request.meta, so here
# item['main_url'] == "http://www.example.com.html".

item['other_url'] = response.url  # response.url == "http://www.example.com/some_page.html"

return item  # finally, we get the item recording the URLs of page1 and page2
```
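To see the whole flow in one place, here is a minimal sketch in plain Python that imitates it without Scrapy installed. The `Request` and `Response` classes and the `run` function are toy stand-ins I made up for illustration, not Scrapy's real classes or engine, and the item is a plain dict rather than a Scrapy Item. It only shows the order of events: a request is "downloaded", its callback is then called, and the item rides along in `meta`.

```python
# Toy stand-ins for scrapy.Request / Response; NOT Scrapy's real classes.
class Request:
    def __init__(self, url, callback):
        self.url = url
        self.callback = callback
        self.meta = {}  # data that travels with the request


class Response:
    def __init__(self, request):
        self.url = request.url
        self.request = request

    @property
    def meta(self):
        # Imitates the idea that response.meta is the request's meta dict.
        return self.request.meta


def parse_page1(response):
    item = {"main_url": response.url}  # plain dict used in place of an Item
    request = Request("http://www.example.com/some_page.html",
                      callback=parse_page2)
    request.meta["item"] = item  # pass the item on to the next callback
    return request


def parse_page2(response):
    item = response.meta["item"]  # the same dict stored in parse_page1
    item["other_url"] = response.url
    return item


def run(start_request):
    """Hypothetical mini engine: 'download' each request, then call its callback."""
    pending = [start_request]
    items = []
    while pending:
        request = pending.pop(0)
        response = Response(request)  # pretend the download has finished
        result = request.callback(response)
        if isinstance(result, Request):
            pending.append(result)  # a new request: schedule it
        else:
            items.append(result)  # anything else is treated as a finished item
    return items


items = run(Request("http://www.example.com.html", callback=parse_page1))
print(items)
```

The printed list holds a single item with both `main_url` and `other_url` filled in, which is the same result the two real callbacks produce together.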