Forum Moderators: coopster
In phpMyAdmin:
MySQL connection collation is utf8_unicode_ci.
My tables in the database also have collation utf8_unicode_ci.
In my webpages, I have <meta http-equiv="content-type" content="text/html; charset=utf-8"/> within <head></head>.
But when I enter some Chinese via PHP (actually, some input from a form), it shows up as gibberish (weird symbols) in the database, even though the Chinese shows up correctly in the webpages. Before, the Chinese showed up as numeric entities (e.g. &#23498;). I don't know what caused it to show as gibberish now.
Note: If I insert Chinese characters into the database through phpMyAdmin, they are stored as the actual Chinese characters.
I want to store the Chinese in &#23456; format (because I believe that's how it is usually stored in the database. If not, please tell me). Can anyone give me some idea of how to store it in &#23456; format?
Any help is appreciated.
Thank you,
kbts
[edited by: eelixduppy at 2:23 pm (utc) on June 23, 2008]
[edit reason] disabled smileys [/edit]
I want to store the Chinese as &#23456; format (because I believe that's how they're usually stored in the database. If not, please tell me).
Just a thought... I'm not sure this is necessarily a good idea. &#23456; is a numeric HTML character reference, which is fine if you only want to retrieve the text and display it in an HTML page, but not for much else. What if you want to search for this character? You would first need to convert the search term into the same numeric reference, and where do you draw the line? Also, &#23456; takes up 8 bytes, whereas the UTF-8 encoded character is at most 4 bytes (3 for this one).
You would need the character reference '&#23456;' if you weren't using a Unicode character encoding (UTF-8 in this case), but the big advantage of using UTF-8 is that you don't need to.
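To put numbers on the size comparison above, here is a minimal sketch. It assumes PHP 7 or later (for the "\u{...}" escape), and the variable names are only for illustration.

```php
<?php
// Illustration only: compare the numeric character reference with the raw UTF-8 character.
$entity = '&#23456;';   // numeric character reference for U+5BA0
$utf8   = "\u{5BA0}";   // the same character as UTF-8 bytes (PHP 7+ escape syntax)

$entityBytes = strlen($entity); // strlen() counts bytes, not characters
$utf8Bytes   = strlen($utf8);

// If references have already been stored, they can be decoded back to real characters.
$decoded = html_entity_decode($entity, ENT_QUOTES | ENT_HTML5, 'UTF-8');

printf("reference: %d bytes, UTF-8: %d bytes (hex %s)\n", $entityBytes, $utf8Bytes, bin2hex($utf8));
var_dump($decoded === $utf8);
```

Storing the decoded character rather than the reference means a search for the character can match it directly.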
penders, for UTF-8 encodings, is it stored as "gibberish" in the MySQL database?
I would guess you might see "gibberish" if you try to view the UTF-8 encoded content (from the database) on a non-UTF-8 encoded page. I can't really say whether this is correct, but you seem to imply it is being stored OK...
...it shows up as gibberish (weird symbols) in the database. But the Chinese shows up correctly in the webpages.
Note: If I do an Insert of Chinese characters in the database through phpMyAdmin, it stores as the actual Chinese characters.
If I select something from the database to be displayed on a webpage, the Chinese shows up fine.
If I insert something into the database via an HTML form, the Chinese shows as gibberish when I view it with phpMyAdmin.
If I insert something into the database via phpMyAdmin and then view that record in phpMyAdmin, it shows up as the Chinese characters (non-gibberish).
My question is, can I assume everything is "OK" if the Chinese shows up correctly on the webpage (while the Chinese shows up as gibberish if I view via phpMyAdmin)?
Thanks,
kbts
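One possible explanation for these symptoms (a guess, not something confirmed in this thread) is double encoding: the form sends UTF-8 bytes, the connection treats them as a single-byte character set, and they get converted to UTF-8 a second time on the way in. Reading back over the same kind of connection undoes the conversion, so the page looks fine while a UTF-8 client such as phpMyAdmin shows gibberish. The sketch below reproduces that round trip; it assumes the mbstring extension is loaded and uses ISO-8859-1 as a stand-in for the connection's character set.

```php
<?php
// Hypothetical reproduction of double encoding; requires the mbstring extension.
$original = "\u{5BA0}"; // one Chinese character as UTF-8 (bytes e5 ae a0)

// Step 1: bytes that are already UTF-8 are treated as ISO-8859-1 and converted to UTF-8 again.
$stored = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Step 2: reading back through the same mismatch reverses the conversion.
$readBack = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

echo bin2hex($original), "\n"; // what the form sent
echo bin2hex($stored), "\n";   // what a UTF-8 client would see in the table
var_dump($readBack === $original);
```

If this is what is happening, the data is recoverable, but anything that reads the table over a UTF-8 connection will see the longer, garbled form.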
It could be the way your forms are submitting or the way PHP is handling the characters. It sounds like a single-byte character set (ISO-8859-something) is being used at some point in the processing / database storage / display chain. Since the characters display correctly and you say the database has UTF-8 settings, I would check the form processing.
I don't believe PHP currently has native support for UTF-8: its string functions treat a string as a plain sequence of bytes, so multi-byte characters are not understood as characters. I don't work with character sets in PHP, so I can't tell you much more, except that I think the form data is being treated as a single-byte string (ISO-8859-1 or similar) somewhere between the input and the database.
$connection = @mysql_connect(DATABASE_HOST, DATABASE_USER, DATABASE_PASSWORD);
# Set character_set_results
mysql_query("SET character_set_results=utf8", $connection);
# Set character_set_client and character_set_connection
mysql_query("SET character_set_client=utf8", $connection);
mysql_query("SET character_set_connection=utf8", $connection);
Regards,
kbts
A SET NAMES 'x' statement is equivalent to these three statements:
SET character_set_client = x;
SET character_set_results = x;
SET character_set_connection = x;
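For anyone using the mysqli extension instead of the mysql_* functions shown in this thread, a rough equivalent might look like the sketch below. The helper name openUtf8Connection() and its parameters are made up for illustration; mysqli::set_charset() is the real call, and it sets the same three variables as SET NAMES while also telling the client library which character set is in use.

```php
<?php
// Sketch only: openUtf8Connection() is a hypothetical helper, not part of any library.
function openUtf8Connection(string $host, string $user, string $password, string $database): mysqli
{
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $db->connect_error);
    }
    // Sets the client, results and connection character sets in one call.
    if (!$db->set_charset('utf8')) {
        throw new RuntimeException('Could not set character set: ' . $db->error);
    }
    return $db;
}
```

On MySQL 5.5.3 or later, 'utf8mb4' can be passed instead of 'utf8' if characters outside the Basic Multilingual Plane need to be stored.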