Question
I am compiling a corpus of Tweets for sentiment analysis and am trying to grab Tweets with Apple Emoji characters.
I have found the Unicode code point for one of the faces: U+1F604 (UTF-16 surrogate pair U+D83D U+DE04), UTF-8: F0 9F 98 84
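For reference, the three notations above can be checked against each other in Python 3. This is only a small sketch; the variable names are illustrative, and it says nothing about how Twitter itself interprets a query.

```python
import struct

emoji = "\U0001F604"  # U+1F604 written as a single code point

# UTF-8 form: four bytes.
utf8_hex = " ".join(f"{b:02X}" for b in emoji.encode("utf-8"))

# UTF-16 form: two 16-bit code units (a surrogate pair).
surrogates = struct.unpack(">2H", emoji.encode("utf-16-be"))

# In Python 3, "\ud83d\ude04" is a string of two lone surrogates,
# not the emoji itself. Round-tripping through UTF-16 with
# "surrogatepass" joins them into the real character.
pair = "\ud83d\ude04"
joined = pair.encode("utf-16", "surrogatepass").decode("utf-16")

print(utf8_hex)                              # F0 9F 98 84
print([f"U+{u:04X}" for u in surrogates])    # ['U+D83D', 'U+DE04']
print(joined == emoji, pair == emoji)        # True False
```

So in Python 3 the escaped surrogate pair and the `\U0001f604` escape are not the same string unless the pair is joined first.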
So far, I haven't been able to get any meaningful results. If I search \ud83d\ude04 I'll get some Tweets back, but nothing useful. \U0001f604 doesn't return anything on search.
Is there any way for me to query Twitter for these characters?
I am using the python-twitter wrapper for the API, but would be willing to use something else if a better alternative exists.
Answer 1:
As @Terence Eden points out, Twitter's REST search API doesn't work with emoji characters, but the streaming API does (as of Jan 2016).
There are a few tools out there for accessing Twitter's APIs in Python. The one I've mostly used is tweepy. It can be installed with pip.
The tweepy docs on setting up the streaming API are quite easy to follow. The strings you filter on need to contain the actual emoji characters (e.g.: '😀'), not an escaped or encoded form.
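A minimal sketch of what that setup might look like, assuming the tweepy 3.x interface (`StreamListener`, `Stream`, `OAuthHandler`); later tweepy releases reorganized the streaming classes, so check the docs for the version you have installed. The function names `build_track_terms` and `stream_emoji`, the `PrintListener` class, and the credential parameters are placeholders of mine, not part of tweepy.

```python
def build_track_terms(code_points):
    """Turn code points such as 0x1F604 into literal characters to filter on."""
    return [chr(cp) for cp in code_points]


def stream_emoji(consumer_key, consumer_secret,
                 access_token, access_secret, terms):
    """Print tweets matching `terms`. Assumes tweepy 3.x is installed."""
    import tweepy  # imported here so the helper above works without tweepy

    class PrintListener(tweepy.StreamListener):
        def on_status(self, status):
            # Called once per matching tweet.
            print(status.text)

        def on_error(self, status_code):
            # Returning False disconnects the stream; 420 means rate limited.
            if status_code == 420:
                return False

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    stream = tweepy.Stream(auth=auth, listener=PrintListener())
    stream.filter(track=terms)  # blocks until the stream disconnects
```

It would be called with your own credentials, for example `stream_emoji(key, secret, token, token_secret, build_track_terms([0x1F604]))`.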
Note that this searches for emojis as "words": that is, surrounded by white space. Something like "free😀" won't be found!
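To make the "words" behaviour concrete, here is a purely local illustration. It is not Twitter's matching code, just a sketch of the rule as described above, and both function names are made up for the example.

```python
def matches_as_word(text, emoji):
    """True if `emoji` appears in `text` as a whitespace-delimited token."""
    return emoji in text.split()


def contains_anywhere(text, emoji):
    """Plain substring check, usable on tweets you have already collected."""
    return emoji in text


print(matches_as_word("so happy 😀 today", "😀"))  # True
print(matches_as_word("free😀", "😀"))             # False
print(contains_anywhere("free😀", "😀"))           # True
```

The substring check only helps for post-filtering a corpus you already have; it does not change what the stream filter returns.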
Answer 2:
This is possible, but it's slightly tricky.
You can't use the standard Twitter search, but you can use the streaming search.
There are open source implementations in Ruby and Node available at https://github.com/mroth/emojitrack-feeder
Source: https://stackoverflow.com/questions/15589533/searching-for-tweets-with-unicode-character-apple-emoji