I read here that Intel introduced SSE 4.2 instructions
for accelerating string processing.
Quote from the article:
T
As for software libraries, I would look at Agner Fog's asmlib. It is a collection of routines optimized in assembly, including several string-manipulation functions that use SSE4.2. It also provides functions I find useful for querying the CPU, such as the cache size at each level and which instruction set extensions (e.g. SSE4.2) are supported.
http://www.agner.org/optimize/asmlib.zip
To enable SSE4.2 in GCC, compile with -msse4.2. If your processor supports AVX, you can use -mavx instead, which also enables SSE4.2.
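
If you would rather use the instructions directly from C, here is a minimal sketch of what that can look like with the SSE4.2 string intrinsics from `<nmmintrin.h>`. It is not taken from asmlib; the function names `find_any` and `find_any_sse42` are made up for this example. It assumes GCC or Clang (for `__builtin_cpu_supports` and the `target` attribute) and falls back to a plain loop on CPUs without SSE4.2 or for sets longer than 16 bytes. Chunks are copied into a local 16-byte buffer so the code never reads past the end of the input, which costs some speed but keeps the sketch safe.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <nmmintrin.h>  /* SSE4.2 intrinsics (PCMPESTRI etc.) */

/* Hypothetical helper: index of the first byte in s[0..len) that occurs
 * in set[0..set_len), or len if there is none. Requires set_len <= 16.
 * The target attribute lets this compile without -msse4.2 on the
 * command line. */
__attribute__((target("sse4.2")))
static size_t find_any_sse42(const char *s, size_t len,
                             const char *set, size_t set_len)
{
    char setbuf[16] = {0};
    memcpy(setbuf, set, set_len);
    __m128i vset = _mm_loadu_si128((const __m128i *)setbuf);

    for (size_t i = 0; i < len; i += 16) {
        size_t n = (len - i < 16) ? (len - i) : 16;
        char chunk[16] = {0};
        memcpy(chunk, s + i, n);  /* padded copy: no over-read of s */
        __m128i v = _mm_loadu_si128((const __m128i *)chunk);

        /* EQUAL_ANY: match each byte of v against the bytes of vset.
         * Returns the index of the first match, or 16 if none. */
        int idx = _mm_cmpestri(vset, (int)set_len, v, (int)n,
                               _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                               _SIDD_LEAST_SIGNIFICANT);
        if (idx < (int)n)
            return i + (size_t)idx;
    }
    return len;
}
#endif

/* Hypothetical entry point: uses SSE4.2 when the CPU reports it,
 * otherwise a portable scalar loop. */
size_t find_any(const char *s, size_t len, const char *set, size_t set_len)
{
#if defined(__x86_64__) || defined(__i386__)
    if (set_len <= 16 && __builtin_cpu_supports("sse4.2"))
        return find_any_sse42(s, len, set, set_len);
#endif
    for (size_t i = 0; i < len; i++)
        for (size_t j = 0; j < set_len; j++)
            if (s[i] == set[j])
                return i;
    return len;
}
```

For example, `find_any("hello, world", 12, ",;", 2)` returns 5, the position of the comma. If you compile the whole file with -msse4.2 you can drop the `target` attribute, but then the binary should only be run on CPUs that support SSE4.2.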