Croatian language

Issues with characters showing correctly

         

Lynne

11:01 pm on Mar 1, 2010 (gmt 0)

Hello, and many thanks in advance for any help you may be able to give!

I have been given some Croatian text to load into an existing website, and I am having trouble getting all the characters to display correctly. Some characters do, some don't. I've been trying different charset declarations and this one seems to work best (though, as I mentioned, it still won't show ALL characters correctly).

<meta http-equiv="content-type" content="text/html; charset=windows-1250">

<meta http-equiv="content-language" content="hr">

Doc type is:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

This is just a basic HTML website, with no XML, PHP, etc.

I have also tried UTF-8 and ISO-8859-2, but both do worse: more characters fail to display correctly.

Any ideas?

Much Appreciation,

Lynne

bill

5:08 am on Mar 2, 2010 (gmt 0)

Welcome to WebmasterWorld, Lynne.

Croatian can be written in Latin or Cyrillic from what I've read. That could make a difference. Which is it?

The suggestions I'm seeing are to use UTF-8 wherever possible. The other option is to use the ISO-8859-2 charset.

Have you tried adding the lang attribute to the HTML tag?
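A quick way to test the charset suggestions above is to check whether each one can represent the Croatian letters at all. This is only a sketch: it assumes the text uses the Latin-script letters with diacritics (č, ć, đ, š, ž and their capitals), and the name `can_encode` is made up for illustration.

```python
# Croatian letters that fall outside plain ASCII (Latin script assumed).
CROATIAN_LETTERS = "čćđšžČĆĐŠŽ"


def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of `text` exists in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# ISO-8859-1 is included only as a contrast case.
for charset in ("utf-8", "iso-8859-2", "windows-1250", "iso-8859-1"):
    print(charset, can_encode(CROATIAN_LETTERS, charset))
```

All three suggested charsets can hold these letters, while ISO-8859-1 cannot. If characters still break under one of the three, a likely suspect is a mismatch between the encoding the file was actually saved in and the one the meta tag declares.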

Lynne

6:02 am on Mar 2, 2010 (gmt 0)

Thank you so much for responding, Bill :)

I have done all you've suggested, but Windows-1250 worked best. The other two options produce far more errors. Some letters show as they should, but others just become question marks.

I don't know if what I have is written in Latin or Cyrillic, so I will check that out. I received a Word document with Croatian set as the language, and of course all the characters display as they should there ...


Many Thanks :)

Lynne

bill

7:11 am on Mar 2, 2010 (gmt 0)

What do you know? Windows-1250 is supposed to have a few more characters than ISO-8859-2. Do you still get illegible characters using Windows-1250?
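To make the difference between the two charsets concrete, the sketch below prints the byte each one uses for the Croatian letters. It also simulates one possible cause of stray question marks. That part is purely an assumption about this case: an editor saving the text as Windows-1252 (the Western European default on many Windows setups) and substituting "?" for letters that charset lacks.

```python
LETTERS = "čćđšž"

# Byte used for each letter in the two Central European charsets.
for ch in LETTERS:
    iso = ch.encode("iso-8859-2").hex()
    win = ch.encode("windows-1250").hex()
    print(ch, "iso-8859-2:", iso, "windows-1250:", win)

# Hypothetical failure: the file is saved as Windows-1252, which has
# š and ž but not č, ć, or đ, so those three are replaced with "?".
saved = LETTERS.encode("windows-1252", errors="replace")

# The browser then decodes the saved bytes with the declared charset.
shown_as_1250 = saved.decode("windows-1250")
shown_as_8859_2 = saved.decode("iso-8859-2")
print(repr(shown_as_1250), repr(shown_as_8859_2))
```

Under that assumption, a Windows-1250 declaration would show š and ž correctly and leave question marks for the rest, while ISO-8859-2 would lose š and ž as well, because it stores them at different byte values. Re-saving the file in the declared encoding, or as UTF-8, would be the thing to try.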

UTF-8 is still preferred, but you'd need to convert the MS Word text into Unicode.
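One way to do that conversion outside Word is sketched below. It assumes the Croatian text has first been exported from Word as a plain-text file in Windows-1250; the function name and the file names are placeholders, not anything from this thread.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "windows-1250") -> None:
    """Re-save the text file `src` as UTF-8 in `dst`."""
    # Read using the encoding the file was actually saved in.
    text = Path(src).read_text(encoding=source_encoding)
    # Write the same characters back out as UTF-8 bytes.
    Path(dst).write_text(text, encoding="utf-8")


# Example call with hypothetical file names:
# convert_to_utf8("croatian-1250.txt", "croatian-utf8.txt")
```

After converting, the HTML file itself has to be saved as UTF-8 and the meta tag has to declare charset=utf-8 to match, otherwise the same mismatch comes back.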

Lynne

9:02 am on Mar 2, 2010 (gmt 0)

Thanks Bill :)

Um, I'm sure there is a simple answer to this, but how do you convert Word text to Unicode? I've never had to do it before, so I don't know, LOL :)

I have just found something very interesting: a website that explains what Unicode is had a link to an explanation page written in Croatian. When I checked that page's encoding, the content-language code was "cs", whereas EVERYTHING I looked up today gave "hr" as the language code.

I'm not home so I can't test at present, but I will report back on whether it works when I try again tomorrow.

Many thanks for your help and assistance here Bill, it is greatly appreciated :)

Lynne