I have the following piece of code which seems to be changing my character set.
$html = "à"; echo $html; // result: à
$html = preg_replace(
I had the same problem. It is caused by UTF-8.
à is encoded as the two bytes 0xc3 0xa0 in UTF-8. In PHP you can write it as "\xc3\xa0".
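If you want to check those bytes yourself, here is a minimal sketch. It writes the character with escape sequences so the result does not depend on how the source file is saved:

```php
<?php
// "\xc3\xa0" is the UTF-8 encoding of à (U+00E0).
$html = "\xc3\xa0";

echo bin2hex($html), "\n"; // c3a0
echo strlen($html), "\n";  // 2 (strlen counts bytes, not characters)
```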
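Without the u flag, PCRE works byte by byte, so \s can match the 0xa0 byte as if it were a non-breaking space (0xa0 is the non-breaking space in Latin-1, not in ASCII). Replacing that byte leaves a lone 0xc3, which is no longer valid UTF-8.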
You can use the u flag to resolve the problem. It makes PCRE treat the pattern and the subject as UTF-8:
$html = preg_replace("/\s/u", "", $html);
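A slightly fuller sketch of the same fix, in case it helps. The input string is made up for illustration (à followed by a space, a tab and a newline), and the final check relies on an empty pattern with the u flag matching only valid UTF-8:

```php
<?php
// Hypothetical input: à, then a space, a tab and a newline, all UTF-8 encoded.
$html = "\xc3\xa0 \t\n";

// With the u flag the subject is read as UTF-8, so the 0xa0 byte
// inside à is never looked at on its own.
$clean = preg_replace('/\s/u', '', $html);

echo bin2hex($clean), "\n"; // c3a0

// preg_match with an empty pattern and the u flag returns 1 only
// when the subject is valid UTF-8.
var_dump(preg_match('//u', $clean) === 1); // bool(true)
```

As far as I can tell, whether the version without u actually strips the 0xa0 byte depends on the PCRE version and the locale, so you may not be able to reproduce the broken output on every setup. Adding u is safe either way when the data is UTF-8.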