I've been making rather poor attempts at the PRIME1 problem on SPOJ. I discovered that using ByteString really helped performance for reading in the problem text.
Doing bulk input is usually faster with bytestrings: because the data is dense, there's simply less data to shuffle from the disk into memory.
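As a rough illustration of bulk input, here is a minimal sketch that reads all of stdin in one go and parses the whitespace-separated integers from it. The name `parseInts` is made up for this example, and the input format (plain decimal integers) is an assumption rather than the exact PRIME1 format.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Maybe (mapMaybe)

-- Hypothetical helper: parse every whitespace-separated integer
-- found in a strict ByteString, skipping tokens that are not numbers.
parseInts :: B.ByteString -> [Int]
parseInts = map fst . mapMaybe B.readInt . B.words

main :: IO ()
main = do
  input <- B.getContents          -- one bulk read of stdin
  print (sum (parseInts input))   -- do something with the numbers
```

The whole input sits in memory as one packed buffer, so there is no per-character list cell to allocate as there would be with `String`.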
Writing data as output, however, is a little different. Typically, you're serializing a structure, generating many small writes. So the dense, bulk writes of bytestrings don't help you much in that case. Even regular Strings will do reasonably at incremental output.
However, all is not lost. We can recover fast bulk writes by efficiently building up bytestrings in memory. This approach is taken by the various *-builder packages.
Instead of converting values to lots of tiny bytestrings, and writing them out one at a time, we stream the conversion into an ever-growing buffer, and in turn, write that buffer in one big piece. This results in a lot less IO overhead, and often significant performance improvements over String IO.
This kind of approach is taken by e.g. webservers in Haskell, or the efficient HTML system, blaze.
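The buffer-then-write idea can be sketched with `Data.ByteString.Builder`, the builder module that ships with the bytestring package. That choice is an assumption made for the example and is not necessarily one of the packages meant above; the names `renderLines` and `renderToLazy` are also made up here.

```haskell
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy.Char8 as BL
import System.IO (BufferMode (BlockBuffering), hSetBinaryMode,
                  hSetBuffering, stdout)

-- Hypothetical helper: one decimal number per line, accumulated in a
-- Builder instead of being written out piece by piece.
renderLines :: [Int] -> BB.Builder
renderLines = foldMap (\n -> mconcat [BB.intDec n, BB.char7 '\n'])

-- Pure variant, handy for inspecting what would be written.
renderToLazy :: [Int] -> BL.ByteString
renderToLazy = BB.toLazyByteString . renderLines

main :: IO ()
main = do
  hSetBinaryMode stdout True
  hSetBuffering stdout (BlockBuffering Nothing)
  -- The Builder fills the handle's buffer directly, in large chunks.
  BB.hPutBuilder stdout (renderLines [1 .. 100000])
```

How much this helps in practice depends on the workload, so it is worth measuring against plain `putStrLn` before committing to it.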
Also, the performance, even with bulk writes, will depend on the efficiency of whatever conversion function you have between your types and bytestrings. For Integer, you could be simply copying the bit pattern in memory to output, or instead going through some inefficient encoder. As a result, you sometimes have to think a bit about the quality of the encoding function you're using, and not just whether to use Char/String or bytestring IO.
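To make the point about conversion functions concrete, here is a small sketch of two ways to turn an `Integer` into decimal output bytes using `Data.ByteString.Builder` (an assumed choice of library for this example). Both produce the same bytes; `viaShow` and `direct` are made-up names, and no timing is claimed here, so the difference should be benchmarked rather than taken on faith.

```haskell
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy.Char8 as BL

-- Goes through an intermediate String (a linked list of Char)
-- before the bytes reach the buffer.
viaShow :: Integer -> BB.Builder
viaShow = BB.string7 . show

-- Uses the library's own decimal encoder, with no String in between.
direct :: Integer -> BB.Builder
direct = BB.integerDec

main :: IO ()
main = do
  let n = 12345678901234567890 :: Integer
  BL.putStrLn (BB.toLazyByteString (viaShow n))
  BL.putStrLn (BB.toLazyByteString (direct n))
```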
Note that performance isn't the main difference between ByteString and String. The former is for binary data while the latter is for Unicode text. If you have binary data, use ByteString; if you have Unicode text, use the Text type from the text package.
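The distinction shows up as soon as the text contains non-ASCII characters. The sketch below assumes UTF-8 as the encoding at the boundary, via `Data.Text.Encoding` from the text package; the sample string and the names `greeting` and `encoded` are made up for illustration.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "naïve café", written with numeric escapes so the source file's
-- own encoding does not matter.
greeting :: T.Text
greeting = T.pack "na\239ve caf\233"

-- Text becomes bytes only once an encoding has been chosen.
encoded :: BS.ByteString
encoded = TE.encodeUtf8 greeting

main :: IO ()
main = do
  print (T.length greeting)   -- counts characters: 10
  print (BS.length encoded)   -- counts bytes: 12
```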