Converting to Emoji

Submitted by 不想你离开。 on 2020-01-24 11:02:56

Question


I am trying to take data that contains Unicode escape sequences and print it with the actual emojis. It is currently in a .txt file, but I will write it to an Excel file later. I am getting an error that I am not sure what to do with. This is the text I am reading:

"Thanks @UglyGod \ud83d\ude4f https:\\/\\/t.co\\/8zVVNtv1o6\"
"RT @Rosssen: Multiculti beatdown \ud83d\ude4f https:\\/\\/t.co\\/fhwVkjhFFC\"

And here is my code:

sampleFile= open('tweets.txt', 'r').read()
splitFile=sampleFile.split('\n')
for line in sampleFile:
    x=line.encode('utf-8')
    print(x.decode('unicode-escape'))

This is the error message:

UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 0: \ at end of string

Any ideas? This is how the data was originally generated:

class listener(StreamListener):

    def on_data(self, data):
        # Check for a field unique to tweets (if missing, return immediately)
        if "in_reply_to_status_id" not in data:
            return
        with open("see_no_evil_monkey.csv", 'a') as saveFile:
            try:
                saveFile.write(json.dumps(data) + "\n")
            except BaseException as e:
                print("failed on data", str(e))
                time.sleep(5)
        return True

    def on_error(self, status):
        print (status)

Answer 1:


Your emoji 🙏 (U+1F64F) is represented in the file as a surrogate pair, \ud83d\ude4f. The unicode-escape codec decodes each escape separately, so you end up with two lone surrogates that Python cannot print or encode as UTF-8. Look at exactly how your tweets.txt file was generated, and try writing the original tweets, emoji included, as UTF-8. This will make reading and processing the text file much easier.
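
To illustrate the surrogate-pair point, here is a small sketch (the variable names are illustrative, not from the question). In Python 3, unicode-escape turns the two \uXXXX escapes into two separate surrogate code units, and a UTF-16 round trip with the surrogatepass error handler is one way to join them into the single code point U+1F64F:

```python
# The escape sequence as raw bytes, the way it appears in tweets.txt.
escaped = b"\\ud83d\\ude4f"

# unicode-escape decodes each \uXXXX on its own: the result is two lone
# surrogates, which cannot be encoded as UTF-8 in this state.
halves = escaped.decode("unicode-escape")

# Round-trip through UTF-16 so the high and low surrogate are combined
# into one character.
emoji = halves.encode("utf-16", "surrogatepass").decode("utf-16")

print(len(halves), len(emoji), hex(ord(emoji)))
```

This only repairs the pair after the fact; it does not undo the other escaping in the file (such as \\/), so parsing the lines as JSON is likely the cleaner route.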




Answer 2:


This is how the data was originally generated... saveFile.write(json.dumps(data) + "\n")

You should use json.loads() instead of .decode('unicode-escape') to read JSON text:

#!/usr/bin/env python3
import json

with open('tweets.txt', encoding='ascii') as file:
    for line in file:
        text = json.loads(line)
        print(text)
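
Two details from the question may be worth spelling out. First, the loop `for line in sampleFile` iterates over the characters of the string rather than over `splitFile`, so a single backslash reaches the decoder on its own, which would explain the "\ at end of string" error at position 0. Second, if `data` in `on_data` was already a JSON string (an assumption; the question does not show its type), then `json.dumps(data)` encoded it a second time and each saved line needs two `json.loads()` calls. A minimal sketch under that assumption, using a made-up payload and a hypothetical helper named `decode_line`:

```python
import json


def decode_line(line):
    """Parse one saved line, assuming it holds doubly encoded JSON (hypothetical helper)."""
    inner = json.loads(line)   # undoes the json.dumps() call in on_data()
    return json.loads(inner)   # parses the tweet's own JSON text


# Made-up payload standing in for the raw JSON text the listener received.
# json.dumps() escapes non-ASCII by default, so the emoji becomes \ud83d\ude4f.
data = json.dumps({"text": "Thanks @UglyGod \U0001F64F"})

# What on_data() would have written to the file for that payload.
saved_line = json.dumps(data) + "\n"

tweet = decode_line(saved_line)
print(tweet["text"])
```

The JSON parser combines the two \uXXXX escapes into one character, so no separate surrogate handling is needed on this path.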


Source: https://stackoverflow.com/questions/38106422/converting-to-emoji
