Python: How to get StringIO.writelines to accept unicode string?

予麋鹿 2020-12-28 15:06

I'm getting a

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 34: ordinal not in range(128)

on a strin

4 answers
  • 2020-12-28 15:35

    You can wrap the StringIO object in a codecs.StreamReaderWriter object to automatically encode and decode unicode.

    Like this:

    import cStringIO, codecs
    buffer = cStringIO.StringIO()
    codecinfo = codecs.lookup("utf8")
    wrapper = codecs.StreamReaderWriter(buffer, 
            codecinfo.streamreader, codecinfo.streamwriter)
    
    wrapper.writelines([u"list of", u"unicode strings"])
    

    buffer will then hold the UTF-8 encoded bytes, which you can read back with buffer.getvalue().

    If I understand your case correctly, you will only need to write, so you could also do:

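    import cStringIO, codecs
    buffer = cStringIO.StringIO()
    # getwriter returns a StreamWriter class that encodes unicode to UTF-8 on write
    wrapper = codecs.getwriter("utf8")(buffer)
    wrapper.writelines([u"list of", u"unicode strings"])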
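
For anyone who wants to try the write-only wrapper on a current interpreter, here is a minimal sketch. It swaps cStringIO for io.BytesIO, which is my assumption and not part of the answer above, so the same lines run on Python 2.6+ and Python 3. The sample strings are made up for illustration.

```python
import codecs
import io

# io.BytesIO stands in for cStringIO here (an assumption, so the sketch
# also runs on Python 3); like cStringIO, it stores raw bytes.
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class; calling it wraps the
# byte stream so unicode text is encoded to UTF-8 as it is written.
wrapper = codecs.getwriter("utf-8")(raw)
wrapper.writelines([u"price: \xa3100\n", u"second line\n"])

# The underlying buffer now holds the encoded bytes.
data = raw.getvalue()
```

The pound sign (u"\xa3") ends up in data as the two bytes 0xC2 0xA3, so no ASCII encoding step is involved.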
    
  • 2020-12-28 15:38

    You can also encode your strings as UTF-8 manually before adding them to the StringIO:

    # Encode only the unicode items; byte strings pass through unchanged.
    rows = [val.encode('utf-8') if isinstance(val, unicode) else val
            for val in rows]
    result.writelines(rows)
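
A self-contained sketch of the same idea, written so that it runs on both Python 2 and Python 3. The name text_type and the sample rows are hypothetical, and io.BytesIO is used as the byte buffer in place of cStringIO (my assumption, since cStringIO does not exist on Python 3).

```python
import io

# Hypothetical helper name: the text type is `unicode` on Python 2
# and `str` on Python 3.
text_type = type(u"")

# Made-up sample data mixing text and already-encoded bytes.
rows = [u"caf\xe9\n", b"plain bytes\n"]

# Build a new list: encode text items, leave byte strings untouched.
encoded_rows = [
    row.encode("utf-8") if isinstance(row, text_type) else row
    for row in rows
]

result = io.BytesIO()
result.writelines(encoded_rows)
data = result.getvalue()
```

The key point is that the encoded values are collected into a new list; reassigning the loop variable inside a plain for loop would leave the original list unchanged.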
    
  • 2020-12-28 15:42

    StringIO documentation:

    Unlike the memory files implemented by the StringIO module, those provided by [cStringIO] are not able to accept Unicode strings that cannot be encoded as plain ASCII strings.

    If possible, use StringIO instead of cStringIO.

  • 2020-12-28 15:44

    Python 2.6 introduced the io module and you should consider using io.StringIO(), "An in-memory stream for unicode text."

    In Python 2.6 the io module is implemented in pure Python and is comparatively slow; from Python 2.7 on it is implemented in C and is much faster.
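
A minimal sketch of the io.StringIO approach, with made-up sample strings. It should run unchanged on Python 2.6+ and Python 3. Note that on Python 2, io.StringIO accepts only unicode objects, so byte strings have to be decoded first or the write raises a TypeError.

```python
import io

# io.StringIO stores text (unicode), so non-ASCII characters are fine.
buffer = io.StringIO()
buffer.writelines([u"price: \xa3100\n", u"second line\n"])

# getvalue() returns text; encode explicitly only when bytes are needed,
# for example just before writing to a socket or a binary file.
text = buffer.getvalue()
encoded = text.encode("utf-8")
```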
