PHP I18n

Need some insight

         

IanKelley

11:19 pm on Aug 7, 2006 (gmt 0)




This isn't really a PHP issue, as it would affect any language, but since PHP programmers have the worst time trying to go international because of the multibyte problem (read: wth were they thinking), I figured this is where someone else is most likely to have encountered the problem I'm having :-)

To illustrate the problem, here's a simple test:


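<?php
// Echo the "in" parameter back as UTF-8 (test script only: the output is not escaped).
$in = isset($_REQUEST['in']) ? $_REQUEST['in'] : '';
header("Content-Type: text/html; charset=UTF-8");
print $in;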

If you are thinking "where's the <html><head>" etc., rest assured that the test works exactly the same with them present.

Save as test.php and then run it in a web browser as: test.php?in=人気のロ

(Note: if the text following the = sign shows up as garbage, that is because WW can't handle the characters. It should be four CJK characters, so substitute four similar characters if you decide to try the test.)

If your browser is, say, Firefox, the output will be what you'd expect.

However, if your browser is IE, the output will be:

?ヲ?ヲ?ヲ?

(Note that there would be no ヲ's, only ?'s, in the actual test. The ヲ's are added because WW edits consecutive question marks EVEN if you put them in a code block.)

I have confirmed that IE is, in fact, using UTF-8 to display the page.

This only happens with a GET request. If you POST the same characters using a form then IE behaves as you'd expect.

So am I correct in assuming that there is a bug in the way that IE encodes data from the address bar?

If so it seems unlikely that there is a workaround but if there is I would love to hear about it.
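
One way to narrow this down is to look at the raw bytes PHP actually receives instead of how the browser renders them. The sketch below is only a diagnostic: describe_input() is a made-up helper name, and the empty-pattern preg_match() with the /u flag is used as a simple "is this well-formed UTF-8?" check.

```php
<?php
// Hypothetical helper (not from the original test): describe the bytes
// that arrived in a request parameter.
function describe_input($value)
{
    return array(
        // Raw bytes as hex, so nothing depends on how the browser renders them.
        'hex' => bin2hex($value),
        // An empty pattern with /u only matches if $value is well-formed UTF-8.
        'valid_utf8' => preg_match('//u', $value) === 1,
    );
}

$in = isset($_GET['in']) ? $_GET['in'] : '';
header('Content-Type: text/plain; charset=UTF-8');
print_r(describe_input($in));
```

If the hex dump is nothing but 3f bytes (literal question marks), the characters were presumably already lost before the request reached PHP, and nothing on the server side can bring them back. If valid_utf8 comes out false instead, the bytes arrived in some other encoding.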

siMKin

1:47 pm on Aug 23, 2006 (gmt 0)




So am I correct in assuming that there is a bug in the way that IE encodes data from the address bar?

It would be only a minor bug compared to all the others ;-)

If so it seems unlikely that there is a workaround but if there is I would love to hear about it.

Did you try:
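<?php
if (!isset($_GET['in']))
{
    // First request: redirect to ourselves with the text percent-encoded.
    header("Location: ".$_SERVER['PHP_SELF']."?in=".urlencode("人気のロ"));
    exit; // stop here so nothing else runs after the redirect
}
else
{
    header("Content-type: text/html; charset=UTF-8");
    print $_GET['in'];
}
?>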
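
For reference, here is roughly what urlencode() does to those four characters. This is a small sketch that assumes the script file itself is saved as UTF-8, so the string literal holds UTF-8 bytes. The variable names are just for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so $text holds UTF-8 bytes.
$text = "人気のロ";

// Each byte outside the unreserved ASCII set becomes %XX, so the result
// is plain ASCII and survives being placed in a URL or a Location header.
$encoded = urlencode($text);

// PHP applies this decoding step on its own when it fills $_GET.
$decoded = urldecode($encoded);

echo $encoded, "\n"; // %E4%BA%BA%E6%B0%97%E3%81%AE%E3%83%AD
```

Because the encoded form is pure ASCII, the browser has nothing left to re-encode. That only helps for URLs your own pages generate, though, not for text a visitor types straight into the address bar.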

Or maybe some other form of encoding, like base64:
[nl3.php.net...]
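
A minimal sketch of the base64 idea, assuming you control the links that carry the parameter. b64url_encode() and b64url_decode() are hypothetical helper names, not PHP built-ins. They wrap base64_encode() and base64_decode() and swap the characters that are awkward in a query string (+, / and =).

```php
<?php
// Hypothetical helpers: URL-safe base64 for query-string values.
function b64url_encode($data)
{
    // Replace + and / with - and _, then drop the trailing = padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function b64url_decode($data)
{
    // Restore the padding to a multiple of 4, then undo the character swap.
    $padded = str_pad($data, strlen($data) + (4 - strlen($data) % 4) % 4, '=');
    return base64_decode(strtr($padded, '-_', '+/'), true);
}

// Assumption: this file is saved as UTF-8.
$token = b64url_encode("人気のロ");
$back  = b64url_decode($token);
```

The link would then look like test.php?in= followed by $token, and the receiving script would run b64url_decode() on $_GET['in'] before printing it. As with urlencode(), this does nothing for text typed by hand into the address bar.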
