I moved my PHP application to a new server. I use a MySQL 5 database. When I'm updating or inserting something into the database, every " and - sign is changed to
SET NAMES UTF8 should be used on every page, when selecting as well as when updating or inserting.
Actually, this query must be run every time you connect to the database. Just add it to your connection code.
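As a rough sketch of what "add it to your connection code" could look like, assuming PHP 7+ and the mysqli extension rather than the older mysql_* functions. The function name connect_utf8 and its parameters are hypothetical, not something from the original application:

```php
<?php
// Hypothetical helper: open a connection and set its character set once,
// so every later query on this connection is sent and received as UTF-8.
function connect_utf8(string $host, string $user, string $pass, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $conn = new mysqli($host, $user, $pass, $db);

    // Sets the connection character set on the server (as SET NAMES utf8 does)
    // and also informs the client library of the encoding in use.
    $conn->set_charset('utf8');

    return $conn;
}
```

Every page would then obtain its connection through this one function instead of repeating the query, for example `$conn = connect_utf8('localhost', 'app_user', 'secret', 'app_db');` with placeholder credentials.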
You need UTF-8 all the way through to make smart quotes and dashes (“”—) and other non-ASCII characters work reliably:
(1) Ensure that the browser sends you characters encoded as UTF-8. Do this by declaring the page that includes the form to be UTF-8:
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
...
(Ignore <form accept-charset>, which doesn't work right in IE.)
(2) PHP deals with raw bytes and doesn't care what encoding they're in, but the database does care, so you have to tell it what encoding the bytes coming from PHP are in. This is what SET NAMES is doing, though mysql_set_charset may be preferable because it also tells the client library which encoding is in use, so that mysql_real_escape_string() escapes correctly.
(3) Once the proper characters have reached the database, it'll need to store them in a Unicode encoding to make sure all characters can fit. Each column can have a different encoding, but you can use DEFAULT CHARACTER SET utf8 when you CREATE TABLE to make all the text columns in it use UTF-8. You can also set the default character set for a database or the whole server to utf8 if you prefer.
If you have already CREATEd the tables and they have a non-UTF-8 collation, you'll have to recreate or alter the tables. You can check the current collation using SHOW FULL COLUMNS FROM sometable;
(4) Make sure you HTML-encode text you output from PHP using htmlspecialchars() and not htmlentities(), which by default (in PHP versions before 5.4, where it assumes ISO-8859-1) will mess up non-ASCII characters.
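To illustrate step (4), here is a small sketch of escaping output with the charset passed explicitly, so the result does not depend on the PHP version's default. The helper name h and the sample string are made up for the example, and PHP 7+ is assumed for the type hints:

```php
<?php
// Hypothetical helper: escape text for HTML output, treating it as UTF-8.
function h(string $text): string
{
    // Passing the charset explicitly avoids relying on the version's default.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$text = "“Smart quotes” — and <b>markup</b>";

// Only the HTML-special characters are replaced; the curly quotes and
// the dash pass through untouched as UTF-8.
$safe = h($text);

// For comparison: htmlentities() told to assume ISO-8859-1 treats each
// UTF-8 byte as a separate Latin-1 character and emits entities for the
// wrong characters.
$mangled = htmlentities($text, ENT_QUOTES, 'ISO-8859-1');
```

Printing `$safe` into a page declared as UTF-8 should show the text as typed, while `$mangled` shows the kind of damage the answer warns about.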
[You can, as an alternative to (2) and (3), just use the default Latin-1 encoding for the connection and the table storage, but put UTF-8 bytes in it nonetheless. The disadvantage of this approach is that it'll look wrong to other tools looking at the database, and lower/upper case characters won't compare against each other in the expected case-insensitive way.]
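The "look wrong to other tools" point can be shown without a database. This is only an illustration, assuming the iconv extension is available; the variable names are invented for the example:

```php
<?php
// "é" stored as its two UTF-8 bytes (0xC3 0xA9).
$utf8Bytes = "\xC3\xA9";

// A tool that believes the column is Latin-1 treats each byte as one
// character. Converting that interpretation to UTF-8 for display yields
// two characters where one was intended.
$asSeenByLatin1Tool = iconv('ISO-8859-1', 'UTF-8', $utf8Bytes);

// One byte per character under the Latin-1 assumption: length 2, not 1.
$latin1Length = strlen($utf8Bytes);
```

Here `$asSeenByLatin1Tool` holds the two-character sequence "Ã©" rather than "é", which is roughly what a database client would display for such a column.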
My guess is you are pasting from some text editor which is transforming the " into an angled pretty quote, and transforming your - into an em dash, which is causing both to be represented as ?.
While you set your database to accept UTF-8 characters, you probably did not set your webserver/PHP to accept those characters. Try playing with the mbstring functions, but check to make sure you aren't using the slanted quotes or dashes.
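As a starting point for experimenting with mbstring, the sketch below checks whether incoming text is valid UTF-8 and shows how a lossy conversion to Latin-1 produces question marks. It assumes the mbstring extension is loaded with its default substitute character; the sample string is made up:

```php
<?php
// Text as pasted from an editor that substitutes typographic characters.
$pasted = "“quoted” — text";

// Check that the incoming bytes really are valid UTF-8.
$isUtf8 = mb_check_encoding($pasted, 'UTF-8');

// Latin-1 has no curly quotes or em dash, so a lossy conversion
// substitutes a placeholder character for each of them.
$latin1 = mb_convert_encoding($pasted, 'ISO-8859-1', 'UTF-8');
```

A similar substitution may be happening somewhere between the form and the table if any step along the way is still treating the data as Latin-1.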