Question
I use pytrends to automatically download data as CSV from Google Trends. The code I used is below. In this case, I am downloading monthly Google Trends data from 2008 to the present.
from pytrends.request import TrendReq
from urllib.parse import unquote
from dateutil.relativedelta import relativedelta
import datetime
import pytrends

google_username = "xxxxx@gmail.com"
google_password = "xxxxx"
# Decodes to the topic ID '/m/07gyp7'
search_term = unquote('%2Fm%2F07gyp7')
# Log in and create the connection object
google_trend = TrendReq(google_username, google_password, custom_useragent='Pytrends')
# Restrict the query to Google News search
google_trend_payload = {'gprop': 'news', 'q': search_term}
# Call trend() on the instance, not on the TrendReq class
trendresult = google_trend.trend(google_trend_payload, return_type='dataframe')
print(trendresult)
The results downloaded manually from the Google Trends website for the first 5 months, compared with the results from pytrends:
Date      Pytrends data   Manual csv data
2008-01   21.0            28.0
2008-02   16.0            19.0
2008-03   16.0            21.0
2008-04   15.0            18.0
2008-05   22.0            31.0
Does anyone know the reason? Thank you.
Answer 1:
I had the same issue, so I had to download the data manually during my project. I have since found the reason: it is Google's sampling method. Each day Google returns a different trend series. Imagine Google has 10 million servers; each day, for each query, it samples only maybe 10 thousand of them. So, to get a consistent series, you can download it 30 (or even 50) times and take the average. For series whose values are not too small (say, a minimum above 30), the standard deviation is around 5%, which is acceptable.
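The averaging idea above can be sketched roughly as follows. This is only an illustration, not pytrends code: `average_series` and `fake_fetch` are hypothetical names, each download is assumed to have already been converted to a dict mapping a period label to a value, and the simulated noise stands in for whatever variation Google's sampling actually produces.

```python
import random
import statistics
from typing import Dict, List

# One download: maps a period label such as "2008-01" to a trend value.
Series = Dict[str, float]


def average_series(samples: List[Series]) -> Dict[str, Dict[str, float]]:
    """Average several downloads of the same query, period by period."""
    if not samples:
        raise ValueError("need at least one sample")
    # Keep only the periods present in every download.
    common = set(samples[0])
    for sample in samples[1:]:
        common &= set(sample)
    summary = {}
    for period in sorted(common):
        values = [sample[period] for sample in samples]
        mean = statistics.mean(values)
        # The sample standard deviation needs at least two values.
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        rel_std = std / mean if mean else float("nan")
        summary[period] = {"mean": mean, "std": std, "rel_std": rel_std}
    return summary


def fake_fetch(rng: random.Random) -> Series:
    """Hypothetical stand-in for one real download (made-up values plus noise)."""
    base = {"2008-01": 40.0, "2008-02": 50.0}
    return {period: value * rng.uniform(0.9, 1.1) for period, value in base.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    downloads = [fake_fetch(rng) for _ in range(30)]  # 30 repeated downloads
    for period, stats in average_series(downloads).items():
        print(period, round(stats["mean"], 1), f"{stats['rel_std']:.1%}")
```

In real use, `fake_fetch` would be replaced by whatever performs one download, ideally spread over several days, since the series reportedly changes from day to day.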
The difference between the manual download and the gtrend download may be because they do not extract the data in the same way. The gtrend download requests a URL of the type https://www.google.com/trends/fetchContent.... I do not know how the manual download is processed, but I do know there is another way to extract the data, through a URL like https://www.google.com/trends/trendsReport.. . The latter returns weekly series for everything (pretty rich).
At the moment, there also seems to be a quota limit problem.
Source: https://stackoverflow.com/questions/39652907/pytrends-trend-results-not-similar-with-manually-downloaded-data