Should everyone use UTF8 for posting in the forums? |
Of course! | 81% | [ 161 ] |
I think everyone should choose the encoding he likes | 11% | [ 23 ] |
What is UTF8? | 6% | [ 13 ] |
Total Votes : 197 |
|
Author |
Message |
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Thu Dec 09, 2004 9:11 pm Post subject: Posts in UTF8? |
|
|
I wanted to know whether there is any guideline on the use of special (non-ASCII) characters, which many languages need, in the forums. My opinion is clearly that everybody should use UTF8, but I don't think I can impose this on everyone.
But it's very frustrating to browse through the forums and find that there is no common encoding. The result is muddled pages that are barely readable, because several encodings are used even within a single thread.
Of course this is no major issue for the English forums, but as I like the localized ones too, I run into this problem all the time. |
|
|
|
codergeek42 Bodhisattva
Joined: 05 Apr 2004 Posts: 5142 Location: Anaheim, CA (USA)
|
Posted: Thu Dec 09, 2004 9:20 pm Post subject: |
|
|
I would rather see an added HTML-entities button or list for letters like é etc. _________________ ~~ Peter: Programmer, Mathematician, STEM & Free Software Advocate, Enlightened Agent, Transhumanist, Fedora contributor
Who am I? :: EFF & FSF |
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Thu Dec 09, 2004 9:32 pm Post subject: |
|
|
codergeek42 wrote: | I would rather see an added HTML-entities button or list for letters like é etc. |
I hope you are kidding?! You are not really assuming that one should write HTML codes for special characters or select them with the mouse? Of course we could also provide an interface where you select all your characters with the mouse. In that case we wouldn't need any keyboard. Yippiiee
P.S.: Sorry for overdoing this, but I couldn't resist using some irony... |
|
|
|
codergeek42 Bodhisattva
Joined: 05 Apr 2004 Posts: 5142 Location: Anaheim, CA (USA)
|
Posted: Thu Dec 09, 2004 9:37 pm Post subject: |
|
|
Meh. To each their own, I guess, but W3C's XHTML recommendation suggests using only ASCII characters and entities for anything non-ASCII, so that the UA (user-agent) can interpret them as needed. /shrug |
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Thu Dec 09, 2004 10:16 pm Post subject: |
|
|
codergeek42 wrote: | Meh. To each their own I guess |
Of course. There was no offense meant, I was just being a little bit ironic.
codergeek42 wrote: | but W3C's XHTML recommendation suggests using only ASCII characters and entities for anything non-ASCII, that way the UA (user-agent) can interpret them as needed. /shrug |
Well, I like W3C standards, but I think Unicode can solve this much better. I don't know if you write a lot in languages that use non-ASCII characters, but if you do, you know what a pain in the ass it is when you can't simply type them on your keyboard. It's very time-consuming to use some other means of representing them... |
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Thu Dec 09, 2004 10:31 pm Post subject: |
|
|
In fact I added the option "What is UTF8?" only as a kind of joke (this is a popular kind of joke poll option, the other one being options with Cowboy Neal, as seen at slashdot.org). But as some people really chose this option, I will tell you what UTF8 is.
From Wikipedia:
UTF-8 (8-bit Unicode Transformation Format) is a lossless, variable-length character encoding for Unicode created by Rob Pike and Ken Thompson. It uses groups of bytes to represent the Unicode standard for the alphabets of many of the world's languages.
For more information: http://en.wikipedia.org/wiki/UTF8 |
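To make "lossless" and "variable-length" a bit more concrete, here is a minimal Python sketch. The helper name `utf8_length` and the sample characters are only illustrative choices, picked from text that appears in this thread; nothing here is part of the forum software.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Sample characters taken from this thread (illustrative choice).
samples = ["e", "ë", "日", "☠"]
lengths = {ch: utf8_length(ch) for ch in samples}

# "Lossless": decoding the encoded bytes gives back the original text.
text = "Lëtzebuerg 日本語"
round_trip = text.encode("utf-8").decode("utf-8")
```

With these samples, the plain ASCII letter takes one byte, the accented Latin letter two, and the Japanese character and the symbol three each, while `round_trip` is identical to `text`.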
|
|
|
liuspider Apprentice
Joined: 03 Feb 2003 Posts: 237
|
Posted: Fri Dec 10, 2004 12:50 am Post subject: |
|
|
Answer: Of course
For Chinese/Korean/Japanese users, UTF-8 is clearly the friendlier encoding: you cannot provide a button or list for every Chinese character (there are more than 10000). |
|
|
|
Grahammm Tux's lil' helper
Joined: 01 Sep 2004 Posts: 84 Location: Berkshire UK
|
Posted: Sat Dec 11, 2004 10:52 am Post subject: |
|
|
codergeek42 wrote: | Meh. To each their own I guess, but W3C's XHTML recommendation suggests using only ASCII characters and entities for anything non-ASCII, that way the UA (user-agent) can interpret them as needed. /shrug |
But when entering text, should the forum user have to worry about that? When the browser sends the HTTP POST command to the server it includes (at least mine does) the charset of the data. In my case it sends UTF-8. So would it not be better for the forum software to convert the non-ASCII characters into the appropriate entities? That way the posting should display correctly whatever the browser is set to (if the browser can render that particular character) |
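A rough sketch of the server-side conversion suggested above, in Python rather than the forum's own PHP. It assumes the browser really does send a `charset` parameter in its `Content-Type` header; the function name `to_ascii_entities` and the ISO-8859-1 fallback are hypothetical choices for this example, not anything phpBB is known to do.

```python
from email.message import Message


def to_ascii_entities(raw: bytes, content_type: str,
                      fallback: str = "iso-8859-1") -> str:
    """Decode submitted bytes using the declared charset, then rewrite
    every non-ASCII character as a numeric character reference."""
    # Let the standard library parse the header parameters.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or fallback  # assumed fallback

    text = raw.decode(charset)
    # "xmlcharrefreplace" turns e.g. U+00E9 into "&#233;".
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

The same character submitted in two different encodings ends up as the same ASCII-only text, for example `to_ascii_entities("é".encode("utf-8"), "text/plain; charset=UTF-8")` and the ISO-8859-1 equivalent both give `&#233;`.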
|
|
|
plate Bodhisattva
Joined: 25 Jul 2002 Posts: 1663 Location: Berlin
|
Posted: Sat Dec 11, 2004 12:47 pm Post subject: |
|
|
The forum software does the conversion to plain ASCII for XHTML compliance for you. Enter Japanese in UTF-8, hit the preview button and look at the edit window: 日本語 becomes Code: | &#26085;&#26412;&#35486; |
|
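For anyone who wants to check what those numeric references stand for, a small Python sketch (not the forum's own code): the standard `html.unescape` function resolves them back to characters, much as a browser would when rendering the page. The references are written here without spaces.

```python
import html

# The three numeric character references from the example above.
references = "&#26085;&#26412;&#35486;"

# Resolve the references to the characters they name.
decoded = html.unescape(references)

# Each number is simply the decimal Unicode code point of one character.
code_points = [ord(ch) for ch in decoded]
```

The point of the ASCII-only form is that it reads the same no matter which encoding the browser assumes for the page.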
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Sat Dec 11, 2004 4:50 pm Post subject: |
|
|
plate wrote: | The forum software does the conversion to plain ASCII for XHTML compliance for you. Enter Japanese in UTF-8, hit the preview button and look at the edit window: 日本語 becomes Code: | &#26085;&#26412;&#35486; |
|
How come, then, that in some threads (where some people use ISO-8859 and others UTF8) there are unreadable symbols, and as soon as you switch the encoding they become readable, but then the ones in the other encoding are no longer readable? |
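The effect described here can be reproduced with a minimal Python sketch, assuming the page simply stores each post's raw bytes as submitted, with no conversion and no declared encoding. The two sample posts are made up for the illustration.

```python
# Two posts stored as raw bytes, each in the encoding its author used.
latin1_post = "Lëtzebuerg".encode("iso-8859-1")
utf8_post = "日本語".encode("utf-8")
page = latin1_post + b"\n" + utf8_post

# The browser has to pick ONE encoding for the whole page.
as_latin1 = page.decode("iso-8859-1").split("\n")
as_utf8 = page.decode("utf-8", errors="replace").split("\n")

# Which post survives under which choice?
latin1_view_ok = [as_latin1[0] == "Lëtzebuerg", as_latin1[1] == "日本語"]
utf8_view_ok = [as_utf8[0] == "Lëtzebuerg", as_utf8[1] == "日本語"]
```

Under ISO-8859-1 only the first post reads correctly, and under UTF-8 only the second, so switching the browser setting just moves the damage from one post to the other.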
|
|
|
Flammie Retired Dev
Joined: 02 Jun 2003 Posts: 633 Location: Dublin, Ireland
|
Posted: Mon Dec 13, 2004 7:37 am Post subject: |
|
|
Hrmph. When GWN said that the forums were being moved towards UTF-8 for the new Chinese section, I thought it meant the forums had finally been moved to UTF-8, but now I see that they still do not indicate any encoding at all, neither in the HTTP headers nor in HTML meta tags. Leaving things for browsers to guess is totally b0rked.
A good example of this is plate's supposedly correct example for Japanese: it looks correct in thread view, but b0rks in the preview window below the edit box unless you tweak your browser's settings, and then the ä's and ö's of the Finnish user interface break. I don't think there are other viable solutions for making things work everywhere than using UTF-8 (unless you want to patch phpBB and its language packs to use entity references as well).
(Of course, now that Chinese came up, the ideal encoding for East Asian languages has always been UTF-16, not UTF-8, but that's still too problematic for common use.)
☠☢☮☹☹☹ |
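For reference, declaring the encoding so that browsers need not guess could look roughly like the Python sketch below. It is not how phpBB does it; the function name `build_utf8_response` is hypothetical, and the only real pieces are the `Content-Type` header value and the equivalent meta tag.

```python
from typing import Dict, Tuple


def build_utf8_response(body_html: str) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, payload) for an HTML page that declares UTF-8
    both in the HTTP header and in a meta tag."""
    page = (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "</head><body>" + body_html + "</body></html>"
    )
    payload = page.encode("utf-8")  # the bytes must actually be UTF-8
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(payload)),
    }
    return headers, payload


headers, payload = build_utf8_response("日本語 and ä, ö on one page")
```

With the declaration in place, Japanese text and Finnish letters can share one page without anyone touching the browser's encoding menu.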
|
|
|
plate Bodhisattva
Joined: 25 Jul 2002 Posts: 1663 Location: Berlin
|
Posted: Mon Dec 13, 2004 8:04 am Post subject: |
|
|
We know that. It's far from perfect, but there's no common rule for the forums as a whole. None of the European-language forums use UTF-8; the Russian forum doesn't, and neither does the Greek one. I'd personally favour moving the whole thing to UTF-8, but that's some way down the line. In the meantime, creating the Chinese forum in UTF-8 is the only (practical) way of allowing for both traditional and simplified Chinese, but again, that's just a convention, nothing that's technically binding in any way. You either respect it, or you don't. Chinese users who don't will politely be asked to switch, or else have to live with the threat that their posts get moved out of the way. That's basically all we can do at this point. |
|
|
|
fernandotcl Veteran
Joined: 20 Nov 2003 Posts: 1396 Location: Sao Paulo, Brazil
|
Posted: Mon Dec 13, 2004 8:16 pm Post subject: |
|
|
Let the browser autodetect. Do not enforce. _________________ RTFM! |
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Tue Dec 14, 2004 10:12 am Post subject: |
|
|
fernandotcl wrote: | Let the browser autodetect. Do not enforce. |
If, for example, there's a thread where some people post in ISO-8859 and others in UTF8, this does not help. Some posts will always be muddled, as auto-detect picks ONE encoding, and the posts in the OTHER are not displayed correctly. |
|
|
|
Cintra Advocate
Joined: 03 Apr 2004 Posts: 2111 Location: Norway
|
Posted: Tue Dec 14, 2004 2:54 pm Post subject: Re: Posts in UTF8? |
|
|
SPW wrote: | I wanted to know whether there is any guideline on the use of special (non-ASCII) characters, which many languages need, in the forums. My opinion is clearly that everybody should use UTF8, but I don't think I can impose this on everyone.
But it's very frustrating to browse through the forums and find that there is no common encoding. The result is muddled pages that are barely readable, because several encodings are used even within a single thread.
Of course this is no major issue for the English forums, but as I like the localized ones too, I run into this problem all the time. |
I agree..
As a frequent user of 'view threads since last visit', I find that the recent addition of subject lines in those most attractive Chinese characters plays havoc with the width of my (normally) Western-encoded display unless viewed with UTF8, but then French or Nordic characters go out the window, i.e. there's a problem.
Limiting the width of Chinese topics would seem to be one answer, as it only takes one wide subject to ruin the whole page. The only alternative so far is switching back and forth between UTF8 and Western.
Other ideas? _________________ "I am not bound to please thee with my answers" W.S. |
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Wed Dec 22, 2004 10:51 am Post subject: Re: Posts in UTF8? |
|
|
Cintra wrote: | [...] unless viewed with UTF8, but then viewing French or Nordic characters goes out the window.. i.e. there's a problem. [...] Other ideas? |
What about nicely asking the French/Nordic/German/... people to adopt UTF8? |
|
|
|
tubamann n00b
Joined: 23 Jan 2004 Posts: 25 Location: Bergen, Norway
|
Posted: Wed Dec 22, 2004 12:00 pm Post subject: Re: Posts in UTF8? |
|
|
SPW wrote: |
What about nicely asking the French/Nordic/German/... people to adopt UTF8? |
Well, as a Norwegian I'm for that. However, everything and everyone is still using ISO-8859, so the change would be big, and affect millions. You just don't say "Let's use UTF-8 today". _________________ http://tubamann.deviantart.com |
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Wed Dec 22, 2004 12:27 pm Post subject: Re: Posts in UTF8? |
|
|
tubamann wrote: | SPW wrote: |
What about nicely asking the French/Nordic/German/... people to adopt UTF8? |
Well, as a Norwegian I'm for that. However, everything and everyone is still using ISO-8859, so the change would be big, and affect millions. You just don't say "Let's use UTF-8 today". |
Affect millions? Are there millions of them using Gentoo Forums? I'm only talking about using UTF8 in the Gentoo forums, which can be done with a simple mouse click in the web browser when posting messages. Nobody has to use UTF8 as "system encoding" and change locales and such. |
|
|
|
Cintra Advocate
Joined: 03 Apr 2004 Posts: 2111 Location: Norway
|
Posted: Wed Dec 22, 2004 1:16 pm Post subject: Re: Posts in UTF8? |
|
|
SPW wrote: | tubamann wrote: | SPW wrote: |
What about nicely asking the French/Nordic/German/... people to adopt UTF8? |
Well, as a Norwegian I'm for that. However, everything and everyone is still using ISO-8859, so the change would be big, and affect millions. You just don't say "Let's use UTF-8 today". |
Affect millions? Are there millions of them using Gentoo Forums? I'm only talking about using UTF8 in the Gentoo forums, which can be done with a simple mouse click in the web browser when posting messages. Nobody has to use UTF8 as "system encoding" and change locales and such. |
Sorry, am I missing something? Click where? |
|
|
|
ian! Bodhisattva
Joined: 25 Feb 2003 Posts: 3829 Location: Essen, Germany
|
Posted: Wed Dec 22, 2004 1:25 pm Post subject: |
|
|
We'll have a look at implementing it and converting everything to UTF-8 in the future. But there is a long long way to go and we don't know if it's feasible for all users. So don't expect it to happen within the next weeks. _________________ "To have a successful open source project, you need to be at least somewhat successful at getting along with people." -- Daniel Robbins |
|
|
|
SPW Guru
Joined: 22 Jul 2003 Posts: 318 Location: Lëtzebuerg
|
Posted: Wed Dec 22, 2004 2:17 pm Post subject: Re: Posts in UTF8? |
|
|
Cintra wrote: | Sorry, am I missing something? Click where? |
Well, that depends on the web browser you use but you should have something like: "View => Encoding => UTF8"
This sets your encoding to UTF8 for the current page (you may also have a (semi-)automatic setting which switches encodings for you).
ian! wrote: | We'll have a look at implementing it and converting everything to UTF-8 in the future. But there is a long long way to go and we don't know if it's feasible for all users. So don't expect it to happen within the next weeks. |
Well, it's nice to see that you are aware of the problem and trying to do something about it. Thanks |
|
|
|
Flammie Retired Dev
Joined: 02 Jun 2003 Posts: 633 Location: Dublin, Ireland
|
Posted: Wed Dec 22, 2004 6:20 pm Post subject: Re: Posts in UTF8? |
|
|
SPW wrote: | Cintra wrote: | Sorry, am I missing something? Click where? |
Well, that depends on the web browser you use but you should have something like: "View => Encoding => UTF8"
This sets your encoding to UTF8 for the current page (you may also have a (semi-)automatic setting which switches encodings for you). |
Well, if the change to UTF-8 is correctly implemented on the web server side, nobody will ever need to click anywhere.
ian! wrote: | We'll have a look at implementing it and converting everything to UTF-8 in the future. But there is a long long way to go and we don't know if it's feasible for all users. So don't expect it to happen within the next weeks. |
I'm already overly anxious as a UTF-8 advocate
Is there anything that would make it not feasible? Web browsers are among the best software when it comes to UTF-8 support; even the Netscape 4 series has it. MySQL supports it. PHP supports it with an ugly mb kludge... |
|
|
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20067
|
Posted: Wed Dec 22, 2004 6:38 pm Post subject: |
|
|
I have no need for UTF-8. _________________ Quis separabit? Quo animo? |
|
|
|
Cintra Advocate
Joined: 03 Apr 2004 Posts: 2111 Location: Norway
|
Posted: Wed Dec 22, 2004 7:33 pm Post subject: Re: Posts in UTF8? |
|
|
SPW wrote: | Cintra wrote: | Sorry, am I missing something? Click where? |
Well, that depends on the web browser you use but you should have something like: "View => Encoding => UTF8"
This sets your encoding to UTF8 for the current page (you may also have a (semi-)automatic setting which switches encodings for you). |
In that case I wasn't missing anything.. I already have UTF-8 set as default both in Firefox and Opera, with Chinese, Norwegian, French, & English in that order in languages setup. That keeps the topic column layout moderately stable.
Regards |
|
|
|
Deathwing00 Bodhisattva
Joined: 13 Jun 2003 Posts: 4087 Location: Dresden, Germany
|
Posted: Thu Dec 23, 2004 8:32 pm Post subject: |
|
|
I disagree on this one. ISO specifications are fine to me. BTW, where is the poll option "we don't need UTF-8"? |
|
|
|
|