When is StringIO used, as opposed to joining a list of strings?

甜味超标 2021-01-30 15:56

Using StringIO as a string buffer is slower than using a list as a buffer.

When is StringIO used?

from io import StringIO


def meth1(string):
    a = []
             


        
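The code in the question is cut off after the first lines of meth1. Below is a minimal sketch of what the two functions presumably looked like. It assumes, based on the answers below (which compare both against string * 100), that meth1 collects 100 copies of the string in a list and joins them, while meth2 writes the same 100 copies into a StringIO. The loop bodies are a reconstruction, not the asker's original text.

```python
from io import StringIO


def meth1(string):
    # Assumed reconstruction: use a list as the buffer, then join once.
    a = []
    for _ in range(100):
        a.append(string)
    return "".join(a)


def meth2(string):
    # Assumed reconstruction: use StringIO as the buffer instead.
    a = StringIO()
    for _ in range(100):
        a.write(string)
    return a.getvalue()
```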
4 answers
  • 2021-01-30 16:36

    Well, I'm not sure I would call that using it as a "buffer": you are just repeating a string 100 times, in two complicated ways. Here is an uncomplicated way:

    def meth3(string):
        return string * 100
    

    If we add that to your test:

    if __name__ == '__main__':
    
        from timeit import Timer
        string = "This is test string"
        # Make sure it all does the same:
        assert meth1(string) == meth3(string)
        assert meth2(string) == meth3(string)
        print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
        print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
        print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
    

    As a bonus, it turns out to be far faster (timings in seconds for meth1, meth2 and meth3, in that order):

    21.0300650597
    22.4869811535
    0.811429977417
    

    If you want to create a bunch of strings, and then join them, meth1() is the correct way. There is no point in writing it to StringIO, which is something completely different, namely a string with a file-like stream interface.
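
To illustrate what "a string with a file-like stream interface" means, here is a small sketch using only io.StringIO from the standard library. The sample text is made up for the example.

```python
from io import StringIO

buf = StringIO()
buf.write("first line\n")   # write() behaves as on a file opened in text mode
buf.write("second line\n")

buf.seek(0)                 # rewind, exactly as with a real file
first = buf.readline()      # read a single line
rest = buf.read()           # read everything after the current position

buf.seek(0)
lines = [line.rstrip("\n") for line in buf]  # iterate line by line, like a file
```

None of this is needed if all you want is to concatenate strings, which is why joining a list is the better fit for that job.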

  • 2021-01-30 16:43

    The main advantage of StringIO is that it can be used wherever a file is expected. So you can, for example, do this (Python 2):

    import sys
    import StringIO
    
    out = StringIO.StringIO()
    sys.stdout = out
    print "hi, I'm going out"
    sys.stdout = sys.__stdout__
    print out.getvalue()
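
The snippet above is Python 2. A rough Python 3 equivalent, as a sketch rather than a drop-in replacement, can use contextlib.redirect_stdout (available since Python 3.4), which restores sys.stdout automatically when the block exits. The variable name captured is just for illustration.

```python
import contextlib
from io import StringIO

out = StringIO()
with contextlib.redirect_stdout(out):
    # Inside the block, print() writes into the StringIO, not the terminal.
    print("hi, I'm going out")

captured = out.getvalue()
print(captured, end="")  # back on the real stdout
```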
    
  • Another approach, based on Lennart Regebro's answer. It is faster than the list method (meth1):

    def meth4(string):
        a = StringIO(string * 100)
        contents = a.getvalue()
        a.close()
        return contents
    
    if __name__ == '__main__':
        from timeit import Timer
        string = "This is test string"
        print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
        print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
        print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
        print(Timer("meth4(string)", "from __main__ import meth4, string").timeit())
    

    Results (sec.):

    meth1 = 7.731315963647944

    meth2 = 9.609279402186985

    meth3 = 0.26534052061106195

    meth4 = 2.915035489152274

  • 2021-01-30 16:54

    If you are measuring speed, you should use cStringIO.

    From the docs:

    The module cStringIO provides an interface similar to that of the StringIO module. Heavy use of StringIO.StringIO objects can be made more efficient by using the function StringIO() from this module instead.

    But the point of StringIO is to be a file-like object, for when something expects one and you don't want to use actual files.
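
As one possible illustration of "something expects a file": csv.reader from the standard library is usually handed an open file, but it accepts a StringIO just as well. The sample data below is invented for the example.

```python
import csv
from io import StringIO

text = "name,count\nspam,3\neggs,5\n"  # hypothetical sample data

# csv.reader normally reads from an open file; a StringIO wrapping the
# text can stand in for it, so nothing has to be written to disk.
rows = list(csv.reader(StringIO(text)))
```

This is handy in tests, where the input can live in the test itself instead of in a fixture file.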

    Edit: I noticed you use from io import StringIO, so you are probably on Python 3, or at least 2.6. The separate StringIO and cStringIO modules are gone in Python 3. I'm not sure which implementation backs io.StringIO. There is also io.BytesIO.
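
A short sketch of how io.StringIO and io.BytesIO differ: the first holds text (str), the second holds binary data (bytes), and neither accepts the other's type. The flag mixed_ok is only there to make the outcome visible.

```python
from io import BytesIO, StringIO

text_buf = StringIO()
text_buf.write("hello")           # str in, str out
text_value = text_buf.getvalue()

byte_buf = BytesIO()
byte_buf.write(b"hello")          # bytes in, bytes out
byte_value = byte_buf.getvalue()

# Writing bytes to a StringIO raises TypeError.
try:
    text_buf.write(b"raw bytes")
    mixed_ok = True
except TypeError:
    mixed_ok = False
```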
