I'm working on a web crawler that grabs data from sites all over the world, so it has to deal with different languages and encodings.
Currently I'm using the following
You can try utf8_encode($str), which converts an ISO-8859-1 string to UTF-8.
http://www.php.net/manual/en/function.utf8-encode.php#89789
Or you can replace the content type meta tag with
from the header of the crawled content.
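As a rough illustration of the utf8_encode() suggestion, here is a minimal sketch that skips strings that are already valid UTF-8 (so they are not double-encoded) and converts everything else. The helper name to_utf8() and its $declaredCharset parameter are hypothetical, not part of the original posts, and the sketch assumes the mbstring extension is available and that the charset the page declared has already been extracted elsewhere.

```php
<?php
// Hypothetical helper (not from the original posts): normalise a crawled string to UTF-8.
// Assumes the mbstring extension is loaded.
function to_utf8(string $str, ?string $declaredCharset = null): string
{
    // Already valid UTF-8: return unchanged to avoid double-encoding.
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str;
    }

    // Assumption: with no declared charset, fall back to ISO-8859-1,
    // which is the only source encoding utf8_encode() handles.
    $from = $declaredCharset ?? 'ISO-8859-1';

    // Note: an unknown charset name makes mb_convert_encoding() fail
    // (a ValueError on PHP 8), so validate $declaredCharset beforehand.
    return mb_convert_encoding($str, 'UTF-8', $from);
}
```

Calling to_utf8($html) covers the same case as utf8_encode($html), while to_utf8($html, 'Windows-1252') handles a page that declares a different charset.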