Forum Moderators: open
I'm using Navicat to import an Excel file into a mySQL database. When I go to view my webpage, question marks appear instead of apostrophes, hyphens and other special characters.
I know this is a character encoding issue, with the format Excel uses causing the issue however I'm not sure how to solve it.
I want to still use MS Excel (workbooks) to manage my database content so I'm not sure how to get round the character encoding issue without using a text file etc.
Anyone have any experience with solving this?
Cheers
If it's still hosed up, you may just have to change your document encoding or change that the proper encoding is being sent from your server. More information [webmasterworld.com].
If it's fine that way and only messed up after storing it in your database, the issue is probably related to the mysql server collation or database and/or table collation [dev.mysql.com] you are using. Alter the tables, change the collation; you may have to re-import the data (but maybe not.)
I know this is a character encoding issue, with the format Excel uses causing the issue however I'm not sure how to solve it.
Ditto Bill's response, and I will add, are you sure it is not Word stuff pasted into Excel?
Curly apostrophes, em-dashes, etc. will drive a sane man daft. There are a few recent threads discussing this stuff.
willbfriendly - i do believe it is Word stuff pasted into Excel being the problem, however pasting it first into Notepad, and then into Excel doesn't fix the problem. I have to manually change the apostrohpes in notepad/or in excel in order to get them to appear correctly online.
Going to read the threads rocknbil has linked to now.
If you are using PHP, you can try something like this before you store the data:
$text = mb_convert_encoding($text, "EUCJP-WIN", "UTF-8");
Also, see here [webmasterworld.com] for a discusssion where I was dealing with a similar issue. I ended up with the following PHP function which does what I needed (you would probably have to tweak it for your needs).
function cleanUpHTML($text)
{
//discard unwanted tags
$text = strip_tags($text, '<p><b><i><ol><ul><li>');
//strip header stuff
$text = stristr($text, '<P');
//strip all attributes (Word garbage)
$text = preg_replace("/<(\w)[^>]*?>/s", "<$1>", $text);
//get rid of useless non breaking spaces
$text = preg_replace("/ /", "", $text);
//get rid of empty p's
$text = preg_replace("/<p><\/p>/i", "", $text);
$text = mb_convert_encoding($text, "EUCJP-WIN", "UTF-8");
return $text;
}