Scrapy - parse a page to extract items - then follow and store item url contents

不思量自难忘° 2021-01-30 04:42

I have a question about how to do this in Scrapy. I have a spider that crawls listing pages of items. Every time a listing page with items is found, there's the parse_

2 answers

日久生厌 (original poster)

2021-01-30 05:23

I'm sitting with exactly the same problem, and since no one has answered your question for 2 days, I take it that the only solution is to follow that URL manually from within your parse_item function.

I'm new to Scrapy, so I wouldn't attempt it with Scrapy itself (although I'm sure it's possible). Instead, my solution will be to use urllib and BeautifulSoup to load the second page manually, extract the information myself, and save it as part of the Item. Yes, that is much more trouble than Scrapy's normal parsing, but it should get the job done with the least hassle.
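A rough sketch of that manual approach, under a few assumptions: the item is a dict-like object with a `url` key, and the only thing wanted from the second page is its `<title>`. To keep the sketch free of third-party dependencies, it swaps BeautifulSoup for the standard library's `html.parser`. The names `TitleParser`, `extract_detail`, `fetch_html`, `enrich_item`, and the `detail_title` field are all hypothetical and not part of Scrapy.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element (stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._chunks = []
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only track the first <title>; ignore any later ones.
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._chunks).strip()


def extract_detail(html):
    """Pull the fields of interest out of the second page's HTML."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return {"detail_title": parser.title}


def fetch_html(url, timeout=10):
    """Blocking download with urllib, decoded using the declared charset if any."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def enrich_item(item, fetch=fetch_html):
    """Load item['url'] and return a copy of the item with the extra fields merged in."""
    enriched = dict(item)
    enriched.update(extract_detail(fetch(item["url"])))
    return enriched
```

Inside `parse_item` this would be called as something like `item = enrich_item(item)` before returning the item. Note that `urlopen` is a blocking call, so each item holds up the crawl while its second page downloads.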
