I have a problem: I need to pause the execution of one function for a while without stopping the parsing as a whole. In other words, I need a non-blocking pause.
The Request object has a callback parameter; try using that for this purpose. I mean, create a Deferred which wraps self.second_parse_function and pause.
Here is my quick and untested example; the changed lines are marked.
    from scrapy import Request, Spider
    from twisted.internet.defer import Deferred

    # pause(result, seconds) must be defined separately; it should return
    # a Deferred that fires with `result` after `seconds`.

    class ScrapySpider(Spider):
        name = 'live_function'

        def start_requests(self):
            yield Request('some url', callback=self.non_stop_function)

        def non_stop_function(self, response):
            parse_and_pause = Deferred()  # changed
            parse_and_pause.addCallback(self.second_parse_function)  # changed
            parse_and_pause.addCallback(pause, seconds=10)  # changed
            for url in ['url1', 'url2', 'url3', 'more urls']:
                yield Request(url, callback=parse_and_pause)  # changed
            yield Request('some url', callback=self.non_stop_function)  # Call itself

        def second_parse_function(self, response):
            pass
If this approach works for you, then you can create a function which constructs a Deferred object following the same pattern. It could be implemented like this:
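    from twisted.internet.defer import Deferred

    def get_perform_and_pause_deferred(seconds, fn, *args, **kwargs):
        # Build a fresh Deferred that runs fn, then pauses for `seconds`.
        d = Deferred()
        d.addCallback(fn, *args, **kwargs)
        d.addCallback(pause, seconds=seconds)
        return d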
And here is a possible usage:
    class ScrapySpider(Spider):
        name = 'live_function'

        def start_requests(self):
            yield Request('some url', callback=self.non_stop_function)

        def non_stop_function(self, response):
            for url in ['url1', 'url2', 'url3', 'more urls']:
                # changed
                yield Request(url, callback=get_perform_and_pause_deferred(10, self.second_parse_function))
            yield Request('some url', callback=self.non_stop_function)  # Call itself

        def second_parse_function(self, response):
            pass