Author |
Message |
FrancoisVal Tux's lil' helper
Joined: 12 May 2005 Posts: 82 Location: Namur, Belgique
|
Posted: Sun Feb 05, 2006 2:24 pm Post subject: man and utf8: problem with accents |
|
|
Hello everybody,
I have switched to UTF-8 on Gentoo. Everything seems to work correctly except man pages. I have read the guide at http://www.gentoo.org/doc/en/utf-8.xml and modified /etc/man.conf as suggested, but it didn't work. In the console or in a graphical terminal, the result is the same: accents and other typical French characters are not displayed correctly.
Does anybody know a solution to the problem?
Thanks for your help,
François Valenduc _________________ François Valenduc |
|
|
|
chrroessner Apprentice
Joined: 02 Dec 2003 Posts: 156 Location: Germany
|
Posted: Sun Feb 05, 2006 3:50 pm Post subject: |
|
|
Check /etc/env.d/70less for the LESSCHARSET option (as described in the doc). Run env-update and check the NROFF line in /etc/man.conf. This should fix your problems, I hope.
Rössi |
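The checks suggested above could be scripted roughly as follows. This is only a minimal sketch: the helper name `nroff_setting` is made up for illustration, and the paths are the ones mentioned in this thread, which may not exist on every system.

```shell
# Hypothetical helper: print the active NROFF line from a man.conf-style file.
nroff_setting() {
    grep -E '^NROFF[[:space:]]' "$1"
}

# Show the settings discussed in this thread, if the files exist.
if [ -r /etc/env.d/70less ]; then
    grep LESSCHARSET /etc/env.d/70less || true
fi
if [ -r /etc/man.conf ]; then
    nroff_setting /etc/man.conf || true
fi

# Value seen by the current shell (only updated after env-update and a new login).
echo "LESSCHARSET=${LESSCHARSET:-unset}"

# Character set of the current locale; a UTF-8 setup should report UTF-8.
if command -v locale >/dev/null 2>&1; then
    locale charmap || true
fi
```

If `LESSCHARSET` and the locale both report UTF-8 and the pages are still garbled, the NROFF line is the remaining suspect.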
|
|
|
FrancoisVal Tux's lil' helper
Joined: 12 May 2005 Posts: 82 Location: Namur, Belgique
|
Posted: Sun Feb 05, 2006 4:58 pm Post subject: |
|
|
Thanks for your answer.
Unfortunately, that didn't solve the problem. I already had a line export LESSCHARSET=utf-8 in the ZSH config file (/etc/zsh/zshenv), and that doesn't work any better. I had read in the howto that ZSH didn't support UTF-8, but that no longer seems to be true: I can display and handle UTF-8 characters correctly with ZSH (in the console or in a graphical terminal). I have also tried with bash, but I still have problems with man pages. Is it really possible to use man with UTF-8 correctly when you are a native French speaker and want to read French with accents or other characters (like é, è, à, ç...)? None of these characters are rendered correctly in man pages. For example, for "é", I get "Ã©". _________________ François Valenduc |
|
|
|
ecatmur Advocate
Joined: 20 Oct 2003 Posts: 3595 Location: Edinburgh
|
|
|
|
FrancoisVal Tux's lil' helper
Joined: 12 May 2005 Posts: 82 Location: Namur, Belgique
|
Posted: Mon Feb 06, 2006 9:10 pm Post subject: |
|
|
I have tested, and when I type echo $'\xc3\xa9', I get "é" in konsole, xterm or a text console. With either bash or zsh, the "é" is displayed correctly. When I use less to view a file, French characters are displayed correctly (in all the situations listed above). So the problem is really due to man, groff or nroff. As explained in the howto, I have put the following line in /etc/man.conf:
NROFF /usr/bin/nroff -mandoc -c
Is there a minimum version of man or groff required to use UTF-8 encoding correctly?
Thanks for your help _________________ François Valenduc |
|
|
|
ecatmur Advocate
Joined: 20 Oct 2003 Posts: 3595 Location: Edinburgh
|
Posted: Mon Feb 06, 2006 11:24 pm Post subject: |
|
|
The comments in man.conf suggest that in some circumstances nroff will "double convert" to utf8 i.e. it interprets a utf8 stream as iso8859 (latin1) and applies the latin1->utf8 conversion. That would give the results you are experiencing.
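The double conversion can be reproduced by hand. The sketch below is an illustration only: `iconv` stands in for whatever conversion nroff applies internally, and the helper name `double_encode` is made up.

```shell
# Hypothetical helper: treat UTF-8 bytes as Latin-1 and convert them to UTF-8 again.
double_encode() {
    printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

# "é" in UTF-8 is the two bytes C3 A9 (octal 303 251).
# After the second conversion it becomes the four bytes c3 83 c2 a9,
# which a UTF-8 terminal shows as two characters instead of one.
double_encode "$(printf '\303\251')" | od -An -tx1
```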
Try "/usr/bin/nroff -c -mandoc -Tlatin1". _________________ No more cruft
dep: Revdeps that work
Using command-line ACCEPT_KEYWORDS? |
|
|
|
neysx Retired Dev
Joined: 27 Jan 2003 Posts: 795
|
Posted: Tue Feb 07, 2006 11:37 am Post subject: |
|
|
With a fr_FR.utf8 locale and /usr/bin/nroff -c -mandoc -Tlatin1 in man.conf, it almost works, but some characters are still borked. With /usr/bin/nroff -mandoc -Tlatin1 (-c removed), underlined é characters are displayed properly, but every à is shown as <C3>.
Strangely enough, /usr/share/man/fr/man1/man.1.gz is utf-8 encoded. I am not familiar with man pages and *roff stuff, but it looks like there's some conversion to another encoding (latin?) and then another one back to utf8, and some chars are lost in the process.
If you recode /usr/share/man/fr/man1/man.1.gz to latin: Code: | gunzip /usr/share/man/fr/man1/man.1.gz
recode u8..l9 /usr/share/man/fr/man1/man.1 | and use /usr/bin/nroff -mandoc in man.conf, then man man is properly displayed.
Keeping the same setup, I tried man nano. Guess what, it works. Why? /usr/share/man/fr/man1/nano.1.gz is latin9-encoded.
The bottom line is that not all man pages appear to share the same encoding.
Hth |
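To see which installed pages use which encoding, something like the following could be used. It is a rough sketch under a few assumptions: the pages are gzip-compressed as in this thread, `iconv` is used only as a UTF-8 validity check (it cannot tell latin1 from latin9), and the helper name `page_encoding` is made up.

```shell
# Hypothetical helper: report whether a gzipped man page is valid UTF-8.
page_encoding() {
    # iconv exits with an error if the input is not well-formed UTF-8.
    if gzip -cd "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
        echo "utf-8 (or plain ASCII)"
    else
        echo "not utf-8"
    fi
}

# Survey the French section-1 pages, if any are installed.
for page in /usr/share/man/fr/man1/*.gz; do
    [ -f "$page" ] || continue
    printf '%s: %s\n' "$page" "$(page_encoding "$page")"
done
```

Pages that contain no accented characters at all are pure ASCII and pass the check as well, so only the "not utf-8" results are conclusive.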
|
|
|
FrancoisVal Tux's lil' helper
Joined: 12 May 2005 Posts: 82 Location: Namur, Belgique
|
Posted: Tue Feb 07, 2006 12:23 pm Post subject: |
|
|
So it seems that different man pages have different character encodings. If a man page is encoded in latin9, using nroff -mandoc works even if the selected locale is fr_BE.utf-8? That looks quite strange to me. Shouldn't we open a bug on bugzilla to ask whether all man pages should use the same encoding?
I will check on my computer when I am back from work and see whether I can make some progress.
Thanks for your help _________________ François Valenduc |
|
|
|
FrancoisVal Tux's lil' helper
Joined: 12 May 2005 Posts: 82 Location: Namur, Belgique
|
Posted: Wed Feb 08, 2006 9:50 am Post subject: |
|
|
Indeed, converting the man page of man to latin9 and using nroff -mandoc works for man and nano. However, I tried the trick with other man pages (ls and mount, for example) and I didn't manage to display these pages correctly. Furthermore, if I have to recode all man pages into a suitable encoding, it is going to be very tedious. I would like to find an easier solution (without going back to iso8859-1), if possible!
Thanks for your help _________________ François Valenduc |
|
|
|
buzz22 n00b
Joined: 12 Apr 2006 Posts: 1
|
Posted: Wed Apr 12, 2006 2:55 pm Post subject: |
|
|
Take a look at http://www.haible.de/bruno/packages-groff-utf8.html. It's a groff extension that lets you view UTF-8 encoded man pages.
It works for me. Just replace Code: | NROFF /usr/bin/nroff -c -mandoc | by Code: | NROFF /usr/bin/groff-utf8 -Tutf8 -mandoc | in your /etc/man.conf.
Hope it can help you,
Laurent |
|
|
|
Dominique_71 Veteran
Joined: 17 Aug 2005 Posts: 1877 Location: Switzerland (Romandie)
|
Posted: Mon Nov 18, 2013 8:54 am Post subject: |
|
|
buzz22 wrote: | Take a look at http://www.haible.de/bruno/packages-groff-utf8.html. It's a groff extension that lets you view UTF-8 encoded man pages.
It works for me. Just replace Code: | NROFF /usr/bin/nroff -c -mandoc | by Code: | NROFF /usr/bin/groff-utf8 -Tutf8 -mandoc | in your /etc/man.conf.
Hope it can help you,
Laurent |
Thanks Laurent, it worked. After trying everything else, I am now able to get the correct characters. It screws up my custom less colour scheme, but at least the characters are the right ones. _________________ "Confirm You are a robot." - the singularity |
|
|
|
grumblebear Apprentice
Joined: 26 Feb 2008 Posts: 202
|
Posted: Tue Nov 14, 2017 12:06 pm Post subject: |
|
|
Sorry for waking up an old thread. But after years of ignoring the garbage displayed in localized man pages on a fully UTF-8 configured Gentoo, I have now found the correct solution.
The line in /etc/man.conf should read
Code: | NROFF /usr/bin/preconv | /usr/bin/nroff -mandoc |
Maybe someone can update the wiki at https://wiki.gentoo.org/wiki/UTF-8.
Or perhaps a bug should be filed for sys-apps/man.
Edit:
This solution garbles man pages that are not utf8-encoded.
Conclusion: Better use sys-apps/man-db and get rid of sys-apps/man. |
|
|
|
mike155 Advocate
Joined: 17 Sep 2010 Posts: 4438 Location: Frankfurt, Germany
|
Posted: Tue Nov 14, 2017 3:42 pm Post subject: |
|
|
Hi grumblebear, thanks for sharing this! This is indeed the right solution and it works.
I dug a little deeper into this. It is 2017 and groff still doesn't support UTF-8 encoded input streams. What a shame! preconv does the right thing: it tries to detect the encoding of the input stream, and if it detects UTF-8, it escapes all non-ASCII characters so that groff can't garble them.
There are other solutions, groff-utf8 for example. But I think the preconv solution is currently the best we can do. |
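The escaping can be inspected directly by running preconv on a short string. This is a small sketch, assuming preconv is installed (it normally comes with groff); the helper name `show_preconv` is made up, and the encoding is forced with `-e utf-8` rather than left to auto-detection.

```shell
# Hypothetical helper: show what preconv hands on to groff for UTF-8 input.
show_preconv() {
    printf '%s\n' "$1" | preconv -e utf-8
}

if command -v preconv >/dev/null 2>&1; then
    # "é" is passed in as its two UTF-8 bytes (octal 303 251);
    # preconv should replace it with an ASCII-only groff escape.
    show_preconv "$(printf '\303\251')"
else
    echo "preconv not found (it normally ships with groff)" >&2
fi
```

Because the output contains only ASCII, the later nroff stage has nothing left to misinterpret as latin1.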
|
|
|
charles17 Advocate
Joined: 02 Mar 2008 Posts: 3664
|
Posted: Tue Nov 14, 2017 5:19 pm Post subject: |
|
|
What's wrong with »This is only needed when sys-apps/man is used instead of sys-apps/man-db.«? |
|
|
|
grumblebear Apprentice
Joined: 26 Feb 2008 Posts: 202
|
Posted: Tue Nov 14, 2017 8:48 pm Post subject: |
|
|
charles17 wrote: |
What's wrong with »This is only needed when sys-apps/man is used instead of sys-apps/man-db.«? |
That sentence is absolutely right. Only the following code block should be updated. Now in 2017 nearly all man pages are utf8-encoded, so it is best to put the preconv step into the NROFF command. |
|
|
|
charles17 Advocate
Joined: 02 Mar 2008 Posts: 3664
|
Posted: Wed Nov 15, 2017 7:47 am Post subject: |
|
|
grumblebear wrote: | That sentence is absolutely right. Only the following code block should be updated. ... |
Feel free to do so. Everybody is allowed to contribute to Gentoo wiki. |
|
|
|
|