Python cStringIO take more time than StringIO in writing (performance of string methods)

血红的双手。 提交于 2019-12-05 06:53:22

The reason that StringIO performs better is behind the scenes it just keeps a list of all the strings that have been written to it, and only combines them when necessary. So a write operation is as simple as appending an object to a list. However, the cStringIO module does not have this luxury and must copy over the data of each string into its buffer, resizing its buffer as and when necessary (which creates much redundant copying of data when writing large amounts of data).

Since you are writing lots of larger strings, this means there is less work for StringIO to do in comparison to cStringIO. When reading from a StringIO object you have written to, it can optmise the amount of copying needed by computing the sum of the lengths of the strings written to it preallocating a buffer of that size.

However, StringIO is not the fastest way of joining a series of strings. This is because it provides additional functionality (seeking to different parts of the buffer and writing data there). If this functionality is not needed all you want to do is join a list strings together, then str.join is the fastest way to do this.

joined_string = "".join(testbuf for index in range(1000))
# or building the list of strings to join separately
strings = []
for i in range(1000):
    strings.append(testbuf)
joined_string = "".join(strings)
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!