Meta tag iso-8859-1 <meta http-equiv="content-type" content="text/html; charset=iso-8859-1"/>
Warning Incorrect use of meta encoding declarations
Suggestion Non-UTF-8 character encoding declared
Changing the meta charset line (or global equivalent in htaccess) will not change the encoding of a file that already exists.
The charset/encoding declaration is purely for the browser's information; the browser has no way of knowing whether the encoding info is actually correct. Humans can tell, because some bits will display as garbage.
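To illustrate why changing the declaration alone does nothing, here is a minimal Python sketch. The function name and the sample string are made up for illustration. It shows that the same text is stored as different bytes in each encoding, so the file itself has to be re-encoded:

```python
def reencode_latin1_to_utf8(raw: bytes) -> bytes:
    """Decode bytes as ISO-8859-1, then re-encode the text as UTF-8 (no BOM)."""
    return raw.decode("iso-8859-1").encode("utf-8")


# Hypothetical sample text containing one non-ASCII character (e with acute accent).
latin1_bytes = "caf\u00e9".encode("iso-8859-1")
utf8_bytes = reencode_latin1_to_utf8(latin1_bytes)

print(latin1_bytes)  # b'caf\xe9'      one byte for the accented letter
print(utf8_bytes)    # b'caf\xc3\xa9'  two bytes for the same letter
```

Pure ASCII text comes out byte-for-byte identical in both encodings, which is why only pages containing non-ASCII characters show a visible difference.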
If you like Notepad, look at "Search and Replace 98", a free HTML editor that can replace whole lines or portions of lines.
But if I change my meta charset line and save the new url file as utf-8, that will take the place of my old iso url file - correct?
File change or global equivalent in htaccess - just do one or the other - not both?
If you find a text editor to change all the pages globally without opening them one at a time, you should be able to change the words at the same time. And as long as you're there, make sure every page has a lang="something", preferably attached to the <html> element. If any page title contains non-ASCII characters, make sure the charset declaration comes before the <title> tag.
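If you want to check many pages for the ordering issue just mentioned, a rough Python sketch along these lines might help. The function name is hypothetical, and the check is a plain string search rather than a real HTML parse, so treat it as a first pass only:

```python
def charset_declared_before_title(html: str) -> bool:
    """Return True if a 'charset=' declaration appears before the <title> tag."""
    lower = html.lower()
    charset_pos = lower.find("charset=")
    title_pos = lower.find("<title")
    if charset_pos == -1:
        return False  # no charset declaration found at all
    # With no <title> present, any declaration counts as "before" it.
    return title_pos == -1 or charset_pos < title_pos


good = '<html lang="en"><head><meta charset="utf-8"><title>Caf\u00e9</title></head></html>'
bad = '<html lang="en"><head><title>Caf\u00e9</title><meta charset="utf-8"></head></html>'

print(charset_declared_before_title(good))  # True
print(charset_declared_before_title(bad))   # False
```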
When I used an HTTP header tool, the info showed an error:
Http content-type iso-8859-1
byte order mark (BOM) No
Meta tag (utf-8)
I think this means there's something wrong on the server side
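The mismatch that report describes (HTTP header says iso-8859-1, meta tag says utf-8) can be checked with a small Python sketch like the one below. The function names and the regular expressions are my own assumptions, not part of any tool mentioned in this thread, and the header and meta strings are just sample inputs:

```python
import re
from typing import Optional


def charset_from_header(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    m = re.search(r"charset\s*=\s*\"?([\w.:-]+)", content_type, re.IGNORECASE)
    return m.group(1).lower() if m else None


def charset_from_meta(html_head: str) -> Optional[str]:
    """Extract the charset named in the first <meta> tag that declares one."""
    m = re.search(r"<meta[^>]*charset\s*=\s*[\"']?([\w.:-]+)", html_head, re.IGNORECASE)
    return m.group(1).lower() if m else None


header = "text/html; charset=iso-8859-1"  # what the server sends
meta = '<meta http-equiv="content-type" content="text/html; charset=utf-8"/>'  # what the page says

print(charset_from_header(header))  # iso-8859-1
print(charset_from_meta(meta))      # utf-8
print(charset_from_header(header) == charset_from_meta(meta))  # False
```

When the two disagree, the HTTP header normally takes precedence over the meta tag, so the fix belongs in the server configuration.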
Opened in Notepad++/Encoding tab/"Encode in UTF-8 w/o BOM" (on some I also clicked "Convert to UTF-8 w/o BOM" because I wasn't sure if I needed to do one or both)/saved/downloaded.
This mapping is added to any already in force, overriding any mappings that already exist for the same extension.
<snip>
The AddCharset directive is useful for both to inform the client about the character encoding of the document so that the document can be interpreted and displayed appropriately, and for content negotiation, where the server returns one from several documents based on the client's charset preference.
This directive specifies a default value for the media type charset parameter (the name of a character encoding) to be added to a response if and only if the response's content-type is either text/plain or text/html. This should override any charset specified in the body of the response via a META element, though the exact behavior is often dependent on the user's client configuration. A setting of AddDefaultCharset Off disables this functionality. AddDefaultCharset On enables a default charset of iso-8859-1. Any other value is assumed to be the charset to be used
Short version: You don't need AddDefaultCharset at all. Use AddCharset and the server should be happy again.
Server setup
How to make the server send out appropriate charset information depends on the server. You will need the appropriate administrative rights to be able to change server settings.
Apache. This can be done via the AddCharset (Apache 1.3.10 and later) or AddType directives, for directories or individual resources (files). With AddDefaultCharset (Apache 1.3.12 and later), it is possible to set the default charset for a whole server. For more information, see the article on Setting 'charset' information in .htaccess.
But wait!
Somewhere along the line you mentioned a php setting. Are you also reading in material from a database? If there's an encoding mismatch, results can be calamitous, because some byte sequences that are valid in ISO-Latin-1 are simply not valid in UTF-8. In the case of displaying text it's not a major issue; you just see garbage. But if a file is trying to get information from a database in order to construct a page...
UTF-8 BOM found at start of file
The UTF-8 Byte Order Mark (BOM) was found at the beginning of the page.
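For anyone who wants to check files for this without opening each one in an editor, here is a minimal Python sketch. The helper names are hypothetical. The UTF-8 BOM is the three bytes EF BB BF at the very start of the file:

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark."""
    return raw.startswith(codecs.BOM_UTF8)  # b'\xef\xbb\xbf'


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM; unchanged if none is present."""
    return raw[len(codecs.BOM_UTF8):] if has_utf8_bom(raw) else raw


with_bom = codecs.BOM_UTF8 + b"<!DOCTYPE html>"
without_bom = b"<!DOCTYPE html>"

print(has_utf8_bom(with_bom))     # True
print(has_utf8_bom(without_bom))  # False
print(strip_utf8_bom(with_bom))   # b'<!DOCTYPE html>'
```

To run this on a real file, read it in binary mode (`open(path, "rb")`) so Python does not decode the bytes first.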
AddCharset (500 internal server error)
AddCharset .html (500 internal server error)
AddCharset .php (500 internal server error)
AddCharset UTF-8 .php .html
AddCharset charset extension [extension] ... AddCharset .php .html (HAVE KEPT IN htaccess file for now even though getting error)
Are .php and .html the filename extensions you actually use? It doesn't matter about extensionless URLs, if any, just the physical files. Filetypes may be listed generically as "text/html", but when you name an extension in the form ".html" that means .html ONLY, not .htm.
Now my pages that I have not converted to saving as utf-8 w/o BOM yet show a dark little diamond with ? in middle instead of >.
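That diamond with a question mark is the Unicode replacement character (U+FFFD), which the browser shows when it is told the page is UTF-8 but finds bytes that are not valid UTF-8. A minimal Python sketch of the same effect follows. The sample text and the "»" character are assumptions, since I can't see the actual character on the page:

```python
# Hypothetical page fragment saved as ISO-8859-1; "\u00bb" is the "»" character.
latin1_page = "Home \u00bb News".encode("iso-8859-1")  # b'Home \xbb News'

# Decoding as UTF-8 (what the browser does once the server says UTF-8):
# the lone byte 0xBB is not valid UTF-8, so it becomes U+FFFD.
shown = latin1_page.decode("utf-8", errors="replace")

# Decoding with the encoding the file was really saved in gives the right text.
fixed = latin1_page.decode("iso-8859-1")

print(shown == "Home \ufffd News")  # True
print(fixed == "Home \u00bb News")  # True
```

Once those remaining pages are re-saved as UTF-8, the bytes will match what the server announces and the diamonds should go away.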