I am trying to stream Twitter data for a fixed period, say 5 minutes, using the Stream.filter() method. I am storing the retrieved tweets in a JSON file. The problem is I am
Access the variable myListener.running, but instead of passing MyListener() directly to Stream, create a variable as follows:
    myListener = MyListener()
    # timeout code here, such as time.sleep(20)
    myListener.running = False
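One way to read this suggestion, sketched without Tweepy so it runs offline: keep a reference to the listener, have the listener check its own running flag inside on_data(), and clear that flag from outside once the timeout has passed. FakeStream, run_with_timeout and the running attribute on the listener are assumptions made for illustration. The streaming.py snippet quoted in this thread shows the library's own flag on the Stream object, so a listener would probably have to check the flag itself, as it does here.

```python
import threading
import time


class MyListener:
    """Stand-in listener; `running` is our own flag (an assumption)."""

    def __init__(self):
        self.running = True
        self.received = []

    def on_data(self, data):
        if not self.running:
            return False  # tells the stream loop to stop
        self.received.append(data)
        return True


class FakeStream:
    """Hypothetical stand-in: feeds data until on_data() returns False."""

    def __init__(self, listener):
        self.listener = listener

    def filter(self):
        n = 0
        while self.listener.on_data("tweet %d" % n) is not False:
            n += 1
            time.sleep(0.001)  # simulate waiting for the next tweet


def run_with_timeout(seconds):
    my_listener = MyListener()
    stream = FakeStream(my_listener)
    # Run the blocking loop in a worker thread so the caller can time it out.
    worker = threading.Thread(target=stream.filter)
    worker.start()
    time.sleep(seconds)          # the "timeout code" from the answer
    my_listener.running = False  # ask the listener to stop
    worker.join(timeout=5)
    return my_listener, worker
```

The worker thread is needed because the streaming call blocks; without it, the line that clears the flag would never be reached.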
So, I was having this issue as well. Fortunately, Tweepy is open source, so it's easy to dig into the problem.
Basically the important part is this here:
    def _data(self, data):
        if self.listener.on_data(data) is False:
            self.running = False
This is in the Stream class in streaming.py.
That means that to close the connection, you just have to return False from the listener's on_data() method.
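As a rough illustration of that mechanism, here is a minimal sketch that mimics the quoted _data() logic with stand-in classes, so it runs without Tweepy or network access. MiniStream, its run() method and the FirstN listener are hypothetical names, not part of Tweepy.

```python
class MiniStream:
    """Hypothetical stand-in that mimics the quoted _data() logic."""

    def __init__(self, listener):
        self.listener = listener
        self.running = False

    def _data(self, data):
        # Same check as in the quoted snippet: only an explicit False stops us.
        if self.listener.on_data(data) is False:
            self.running = False

    def run(self, source):
        self.running = True
        for item in source:
            if not self.running:
                break
            self._data(item)


class FirstN:
    """Listener that accepts n items, then returns False."""

    def __init__(self, n):
        self.n = n
        self.seen = []

    def on_data(self, data):
        if len(self.seen) >= self.n:
            return False
        self.seen.append(data)
        return True


listener = FirstN(3)
stream = MiniStream(listener)
stream.run(range(10))
print(listener.seen)
```

Because the check is `is False`, a listener that returns None (for example, by forgetting the return statement) does not stop the loop in this sketch.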
In order to close the stream you need to return False from on_data(), or on_status().
Because tweepy.Stream() runs a while loop itself, you don't need the while loop in on_data().
When initializing MyListener, you didn't call the parent class's __init__ method, so it wasn't initialized properly.
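To illustrate why the missing parent __init__ call matters, here is a minimal sketch with a hypothetical BaseListener standing in for the parent class. The api attribute it sets is an assumption for illustration, not a statement about what Tweepy's own __init__ does.

```python
class BaseListener:
    """Hypothetical parent class; sets up state that subclasses rely on."""

    def __init__(self, api=None):
        self.api = api or "default-api"


class BrokenListener(BaseListener):
    def __init__(self, time_limit=60):
        # Parent __init__ is never called, so self.api is never set.
        self.limit = time_limit


class FixedListener(BaseListener):
    def __init__(self, time_limit=60):
        super().__init__()  # let the parent initialize its own state first
        self.limit = time_limit


print(hasattr(BrokenListener(), "api"))
print(hasattr(FixedListener(), "api"))
```

Any parent-class method that later reads the missing attribute would fail with an AttributeError on the broken subclass.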
So for what you're trying to do, the code should be something like:
    import time
    import tweepy

    class MyStreamListener(tweepy.StreamListener):
        def __init__(self, time_limit=60):
            self.start_time = time.time()
            self.limit = time_limit
            self.saveFile = open('abcd.json', 'a')
            super(MyStreamListener, self).__init__()

        def on_data(self, data):
            if (time.time() - self.start_time) < self.limit:
                self.saveFile.write(data)
                self.saveFile.write('\n')
                return True
            else:
                # Time limit reached: close the file and stop the stream.
                self.saveFile.close()
                return False

    # api is assumed to be an authenticated tweepy.API instance created earlier.
    myStream = tweepy.Stream(auth=api.auth, listener=MyStreamListener(time_limit=20))
    myStream.filter(track=['test'])