I am using Delphi 7 and ICS components to communicate with php script and insert some data in mysql database...
How to post unicode data using http post ?
Af
Your example code shows your data coming from a TNT Unicode control. That value will have type WideString
, so to get UTF-8 data, you should call Utf8Encode
, which will return an AnsiString
value. Then call UrlEncode
on that value. Make sure UrlEncode
's input type is AnsiString
. So, something like this:
var
data, date, username, passhash, datahash, note: AnsiString;
date := FormatDateTime('yyyymmddhh:nn',now);
username := Utf8Encode(edtUserName.Text);
passhash := getMd51(edtPassword.Text);
datahash := getMd51(data);
note := Utf8Encode(memoNote.Text);
data := Format('date=%s&username=%s&password=%s&hash=%s¬e=%s&action=%s',
[UrlEncode(date),
UrlEncode(username),
UrlEncode(passhash),
UrlEncode(datahash),
UrlEncode(note),
'i'
]);
There should be no need to UTF-8-encode the MD5 values since MD5 string values are just hexadecimal characters. However, you should double-check that your getMd51
function accepts WideString
. Otherwise, you may be losing data before you ever send it anywhere.
Next, you have the issue of receiving UTF-8 data in PHP. I expect there's nothing special you need to do there or in MySQL. Whatever you store, you should get back identically later. Send that back to your Delphi program, and decode the UTF-8 data back into a WideString
.
In other words, your Unicode data will look different in your database because you're storing it as UTF-8. In your database, you're seeing UTF-8-encoded data, but in your TNT controls, you're seeing the regular Unicode characters.
So, for instance, if you type the character "ش" into your edit box, that's Unicode character U+0634, Arabic letter sheen. As UTF-8, that's the two-byte sequence 0xD8 0xB4. If you store those bytes in your database, and then view the raw contents of the field, you may see characters interpreted as though those bytes are in some ANSI encoding. One possible interpretation of those bytes is as the two-character sequence "Ø´", which is the Latin capital letter o with stroke followed by an acute accent.
When you load that string back out of your database, it's still encoded as UTF-8, just as it was when you stored it, so you will need to decode it. As far as I can tell, neither PHP nor MySQL does any massaging of your data, so whatever UTF-8 character you give them will be returned to you as-is. If you are using the data in Delphi, then call Utf8Decode
, which is the complement to the Utf8Encode
function that you called previously. If you are using the data in PHP, then you might be interested in PHP's utf8_decode
function, although that converts to ISO-8859-1, which doesn't include our example Arabic character. Stack Overflow already has a few questions related to using UTF-8 in PHP, so I won't attempt to add to them here. For example:
I would expect (without knowing for sure) that you'd have to output them as &#nnnnn entities (with the number in decimal rather than hex ... I think)
Encode the UTF-8 data in application/x-www-form-urlencoded. This will ensure that the server can read the data over the http connection