Gentoo Forums
Saving files as utf-8
fishbone
n00b
Joined: 20 Mar 2004
Posts: 52
Location: Bergen, Norway

Posted: Thu Dec 08, 2005 9:28 pm    Post subject: Saving files as utf-8

Hi,

I'm going nuts over this localisation stuff.

My system works pretty well with Danish characters. I can pretty much use special Danish characters in file names, console, applications etc.

However, when I edit files for web-use, I'm in trouble. If I open up a file in any editor (say, Gvim or Gnome's own text editor) I can edit the file and everything is well. If I upload the file to a webserver and open up the URL in a browser, special Danish characters are garbled. Why?

My locale settings are:
Code:
LANG=da_DK.UTF8
LC_CTYPE="en_DK.UTF8"
LC_NUMERIC="en_DK.UTF8"
LC_TIME="en_DK.UTF8"
LC_COLLATE="en_DK.UTF8"
LC_MONETARY="en_DK.UTF8"
LC_MESSAGES="en_DK.UTF8"
LC_PAPER="en_DK.UTF8"
LC_NAME="en_DK.UTF8"
LC_ADDRESS="en_DK.UTF8"
LC_TELEPHONE="en_DK.UTF8"
LC_MEASUREMENT="en_DK.UTF8"
LC_IDENTIFICATION="en_DK.UTF8"
LC_ALL=en_DK.UTF8


Any other information needed? Does anybody have a clue to this?
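
For anyone reading the settings above: LANG says da_DK but LC_ALL says en_DK, and LC_ALL wins for every category. Here is a small Python sketch of the usual POSIX lookup order (LC_ALL, then the LC_* variable, then LANG). The function names are made up for illustration; this is not what glibc runs internally.

```python
def effective_locale(env, category):
    """Return the locale name in effect for one category, e.g. "LC_CTYPE"."""
    # POSIX precedence: LC_ALL overrides the per-category variable,
    # which in turn overrides LANG. Empty values count as unset.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # fallback when nothing is set


def codeset_of(locale_name):
    """Extract the codeset part of a name such as "en_DK.UTF8"."""
    base = locale_name.split("@", 1)[0]  # drop a modifier like "@euro"
    return base.partition(".")[2]        # empty string if no codeset given


if __name__ == "__main__":
    # Values copied from the locale output in this post.
    env = {"LANG": "da_DK.UTF8", "LC_ALL": "en_DK.UTF8"}
    name = effective_locale(env, "LC_CTYPE")
    print(name, codeset_of(name))
```

Either way the codeset is UTF-8, so the locale itself is probably not what garbles the page.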
_________________
--
fishbone
plasmagunman
l33t
Joined: 07 Jun 2002
Posts: 604
Location: berlin

Posted: Thu Dec 08, 2005 11:45 pm


  1. make sure that the file was really saved in utf-8, "file -bi <filename>" should tell you the character set used
  2. if you have ssh-access to the server do the same with the file there, to make sure nothing changed the encoding during the transfer (don't know if that ever happens...)
  3. check if the character encoding of your browser is set to utf-8, in firefox that's in the "view"-menu

if all these tests are positive i can't help you, sorry.
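
For step 1, a rough Python equivalent of the check is sketched below. The helper names are made up for illustration, and it only tests whether the bytes decode as UTF-8; it is not how "file" itself decides.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def has_non_ascii(data: bytes) -> bool:
    """True if any byte is outside the 7-bit ASCII range."""
    return any(b > 0x7F for b in data)


def describe(path: str) -> str:
    """Classify a file as ascii, utf-8, or something else."""
    with open(path, "rb") as fh:  # binary mode: no implicit decoding
        data = fh.read()
    if not has_non_ascii(data):
        return "ascii (also valid utf-8)"
    return "utf-8" if is_valid_utf8(data) else "not utf-8"
```

A file with only ASCII characters passes as UTF-8 too, so the check is only meaningful once the file actually contains characters like æ, ø or å.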
_________________
please, feel free to correct my english. - por favor, corrige mi español.
LemurFromTheId
n00b
Joined: 22 Apr 2005
Posts: 64
Location: Finland

Posted: Fri Dec 09, 2005 12:41 am

The special Danish characters are most likely garbled because your browser doesn't identify the page's encoding as UTF-8 and falls back to its own or the server's default encoding, usually ISO-8859-1 or Windows-1252. (You can confirm this in your browser: in Opera check the Info panel -> Encoding from server, which also tells you if no encoding is supplied at all, and in Firefox right click -> View page info or something... ) There are two likely reasons:

1) Your web server explicitly tells your browser to use some other encoding.

2) Your page doesn't provide encoding information. If you're playing with HTML, try something like <META http-equiv="Content-Type" content="text/html; charset=UTF-8"> in the Head section. For PHP you need to have this at the very top of the page, before any output is sent: <?php header("Content-Type: text/html; charset=UTF-8"); ?>.
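
To see what the fallback does to the text, here is a minimal Python sketch. It encodes a string as UTF-8 and then decodes the same bytes with the wrong charset, which is roughly what a browser does when it guesses ISO-8859-1. The second helper pulls the charset out of a Content-Type value using the standard library's email parser. Both function names are made up for this example.

```python
from email.message import Message


def misread(text: str, wrong_encoding: str = "iso-8859-1") -> str:
    """Encode as UTF-8, then decode the bytes with a different charset."""
    return text.encode("utf-8").decode(wrong_encoding)


def declared_charset(content_type: str):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased by the library


if __name__ == "__main__":
    # Each Danish letter is two bytes in UTF-8, so it shows up as two
    # unrelated characters when read as ISO-8859-1.
    print(misread("æøå"))
    print(declared_charset("text/html; charset=UTF-8"))
    print(declared_charset("text/html"))  # no charset supplied at all
```

If the last call is the situation on your server, the browser has to guess, and that is where the garbling comes from.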
fishbone
n00b
Joined: 20 Mar 2004
Posts: 52
Location: Bergen, Norway

Posted: Wed Dec 14, 2005 8:57 pm

plasmagunman wrote:

  3. check if the character encoding of your browser is set to utf-8, in firefox that's in the "view"-menu



That was it. I am a moron. Apart from OpenOffice everything was working fine and I wasn't. Setting LINGUAS to "da en" in /etc/make.conf, adding UTF to my USE flags and re-emerging (updating) OpenOffice also fixed that, so now I'm 100% utf-8 :)
_________________
--
fishbone
Cintra
Advocate
Joined: 03 Apr 2004
Posts: 2111
Location: Norway

Posted: Thu Dec 15, 2005 11:56 am

Hei fishbone

I'm thinking of experimenting with your en_DK.utf8 setup, since there isn't an en_NO.utf8, and the difference between our keyboards is negligible..

I wonder what you have in your /etc/locales.build ?

Mvh
_________________
"I am not bound to please thee with my answers" W.S.
fishbone
n00b
Joined: 20 Mar 2004
Posts: 52
Location: Bergen, Norway

Posted: Thu Dec 15, 2005 6:33 pm

All I have is:
Code:
en_US/ISO-8859-1
en_US.UTF-8/UTF-8
ja_JP.EUC-JP/EUC-JP
ja_JP.UTF-8/UTF-8
ja_JP/EUC-JP
en_HK/ISO-8859-1
en_PH/ISO-8859-1
de_DE/ISO-8859-1
de_DE@euro/ISO-8859-15
es_MX/ISO-8859-1
fa_IR/UTF-8
fr_FR/ISO-8859-1
fr_FR@euro/ISO-8859-15
it_IT/ISO-8859-1


And output from 'locale -a':
Code:
locale -a | grep DK
da_DK
da_DK.utf8
en_DK
en_DK.utf8


However, take a look at http://www.gentoo.org/doc/en/utf-8.xml to see if you can make an en_NO.utf8 yourself. Good luck.
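
The entries in the file above all follow a "name/charmap" pattern, so they are easy to inspect with a few lines of Python. This is only a sketch: the function names are made up, and skipping lines that start with "#" is an assumption about how comments look in that file.

```python
def parse_locales_build(text: str):
    """Split "name/charmap" lines into (name, charmap) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        name, sep, charmap = line.partition("/")
        if sep:  # ignore lines without a slash
            entries.append((name, charmap))
    return entries


def utf8_entries(entries):
    """Names of the entries whose charmap is UTF-8."""
    return [name for name, charmap in entries if charmap.upper() == "UTF-8"]


if __name__ == "__main__":
    # A few of the lines quoted in this post.
    sample = "en_US/ISO-8859-1\nen_US.UTF-8/UTF-8\nfa_IR/UTF-8\nit_IT/ISO-8859-1\n"
    entries = parse_locales_build(sample)
    print(utf8_entries(entries))
    print([name for name, _ in entries if "DK" in name])
```

Run against the full list, the second print comes back empty: there is no DK entry in the file, even though 'locale -a' lists da_DK and en_DK.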
_________________
--
fishbone
Cintra
Advocate
Joined: 03 Apr 2004
Posts: 2111
Location: Norway

Posted: Thu Dec 15, 2005 6:38 pm

Thanks.. I'm in the middle of re-emerging glibc right now ;-)
Mvh

Edit - it's looking good, but I do get a warning when opening tagtool:
Code:
# tagtool
(tagtool:29411): Gdk-WARNING **: locale not supported by Xlib
(tagtool:29411): Gdk-WARNING **: cannot set locale modifiers
Qt: Locales not supported on X server

this is referred to in https://forums.gentoo.org/viewtopic.php?t=166984& and sure enough there is no sign of en_DK in either /usr/X11R6/lib/X11/locale/locale.alias or .../locale.dir - hmm! what did you do about that?
_________________
"I am not bound to please thee with my answers" W.S.
fishbone
n00b
Joined: 20 Mar 2004
Posts: 52
Location: Bergen, Norway

Posted: Thu Dec 15, 2005 7:57 pm

Hmm... I get the same error every time I start an application through a terminal - but I haven't done anything about it and everything seems to work fine.

However, looking into it shows that I haven't compiled glibc with the userlocales USE-flag. I wonder if that matters. Did you include that flag before emerging glibc?

I'll look some more...
_________________
--
fishbone
Cintra
Advocate
Joined: 03 Apr 2004
Posts: 2111
Location: Norway

Posted: Thu Dec 15, 2005 8:04 pm

Yes, but as you say, it doesn't seem to do any damage and at least I'm able to produce genuine utf-8 files now, so I'm not madly worried. I put a post at the end of the UTF-8 Howto thread about this, so hopefully someone has some ideas, though I have a feeling this requires 'upstream' attention..

It's been a good day anyway ;-)
Thanks for your help

Edit: I went back to LC_ALL="en_US.utf8" and ripped the Portuguese tracks afresh with audiocd:/ - the new files display accented characters correctly & tagtool sees both the files and tags correctly. What a game.. ;-)
_________________
"I am not bound to please thee with my answers" W.S.
All times are GMT