Python: Split unicode string on word boundaries

清酒与你 2020-12-31 13:16

I need to take a string, and shorten it to 140 characters.

Currently I am doing:

import re
if len(tweet) > 140:
    # Collapse runs of whitespace into single spaces
    tweet = re.sub(r"\s+", " ", tweet)
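As a rough sketch of the goal stated above (not the code from this post), one way to shorten a string to a character limit without cutting a word in half is shown below. The helper name `shorten` and the hard-cut fallback for text without spaces are assumptions made for illustration; it counts Python 3 `str` characters, not bytes.

```python
import re


def shorten(text, limit=140):
    """Hypothetical helper: trim text to at most `limit` characters,
    cutting at the last whitespace that fits."""
    # Normalize whitespace first so every gap is exactly one space.
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) <= limit:
        return text
    # Take one extra character so a word ending exactly at the limit is kept.
    window = text[:limit + 1]
    head, sep, _ = window.rpartition(" ")
    # No space inside the window: fall back to a hard cut.
    return head if sep else text[:limit]


print(shorten("hello   world foo", 11))
print(shorten("héllo wörld", 7))
```

Slicing one character past the limit means a word that ends exactly at the limit survives intact, while a word that crosses the limit is dropped whole.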

9 answers
  •  一整个雨季
    2020-12-31 13:46

    I tried the solution with PyAPNS for push notifications and wanted to share what worked for me. The problem I ran into was that truncating at 256 bytes of UTF-8 caused the notification to be dropped. I had to truncate the string as encoded with "unicode_escape" to get it to work. I assume this is because the result is sent as JSON rather than raw UTF-8. Anyway, here is the function that worked for me:

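    def unicode_truncate(s, length, encoding='unicode_escape'):
        # Cut the encoded bytes at the limit; the cut may land inside an escape sequence.
        encoded = s.encode(encoding)[:length]
        # 'ignore' drops any incomplete trailing sequence left by the cut.
        return encoded.decode(encoding, 'ignore')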
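To illustrate what the `'ignore'` error handler does here, the following sketch repeats the function so it runs on its own and applies it to a made-up sample string under Python 3. The sample text and the limits are assumptions, chosen only so that one cut lands inside an escape sequence.

```python
def unicode_truncate(s, length, encoding='unicode_escape'):
    # Same function as above, repeated so this example is self-contained.
    encoded = s.encode(encoding)[:length]
    return encoded.decode(encoding, 'ignore')


sample = 'caf\u00e9 au lait'  # 'café au lait'

# In the escaped form, the single character é takes four bytes (\xe9).
print(sample.encode('unicode_escape'))

# A limit of 5 falls inside the \xe9 escape, so the partial escape is dropped.
print(unicode_truncate(sample, 5))

# A limit of 7 falls right after the escape, so é is kept.
print(unicode_truncate(sample, 7))
```

Note that the limit applies to the escaped byte length, not to the number of characters, so text with many non-ASCII characters is cut much earlier than plain ASCII text.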
    
