Posted: Fri Aug 30, 2013 6:26 am Post subject: Unicode support broken in zsh-5.0.2 and zsh-5.0.2-r3
I'm building a new system and installed the latest zsh from Portage. I noticed that zsh no longer recognizes Unicode when determining the length of a string: it now counts bytes instead of characters. Also, I can no longer copy-and-paste Unicode characters into the zsh command line. This all works fine with Bash on the new system and also with zsh-4.3.17 (which is no longer in Portage) on an older system.
To test the problem I made a one-line script:
This is equivalent to:
in both Bash and Zsh.
I can source this script in zsh, and echo $hbar shows the Unicode character as expected. But if I ask zsh for the length of $hbar, it says "3" when it should say "1". Both Bash and the older zsh correctly say "1".
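For anyone who wants to reproduce this, here is a minimal sketch of the length check. It assumes the character is U+210F (ℏ), which is only a guess based on the variable name and the 3-byte count; the names chars and bytes are just for illustration.

```shell
# Assumption: hbar holds U+210F, whose UTF-8 encoding is the bytes E2 84 8F.
# Octal escapes in printf keep this portable across bash, zsh, and ash.
hbar=$(printf '\342\204\217')

# ${#var} should count characters when the shell honors a UTF-8 locale.
chars=${#hbar}

# wc -c always counts bytes; the arithmetic expansion strips any padding.
bytes=$(( $(printf '%s' "$hbar" | wc -c) ))

echo "chars=$chars bytes=$bytes"
```

If chars comes out equal to bytes, the shell is treating the string as raw bytes, which is the symptom described above.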
Here are the package settings:
app-shells/zsh-5.0.2-r3 was built with the following:
USE="gdbm pcre unicode -caps -debug -doc -examples -maildir -static"
These are the same USE flags that were used for the working zsh-4.3.17. I'd be glad to post the full emerge --info if anyone thinks that might help.
I set LANG to en_US.UTF-8 in /etc/env.d/02locale and ran env-update and source /etc/profile. The locale command confirms that is what LANG is set to. I don't know what else to do to get zsh to deal with unicode correctly. I found other unicode problems in the new zsh but this one was the easiest to explain. Those problems also had zsh counting bytes instead of characters.
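Since LANG is only one of several locale variables, a quick sanity check of what the running shell actually sees might look like the sketch below. It uses only the standard locale utility; the variable names charmap and utf8_locales are illustrative.

```shell
# LC_ALL overrides LC_CTYPE, which overrides LANG, for character handling.
echo "LANG=${LANG:-} LC_CTYPE=${LC_CTYPE:-} LC_ALL=${LC_ALL:-}"

# The codeset the C library actually selected for this process.
charmap=$(locale charmap 2>/dev/null || true)
charmap=${charmap:-unknown}
echo "charmap=$charmap"

# Count the generated locales whose names mention utf8 or utf-8.
utf8_locales=$(locale -a 2>/dev/null | grep -ic 'utf-\{0,1\}8' || true)
echo "installed UTF-8 locales: $utf8_locales"
```

Running this in both Bash and zsh on the same system and comparing the output would show whether the two shells start with the same locale environment. A charmap other than UTF-8 would typically mean the requested locale is not in effect for that process.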
I have worked around a similar problem in the busybox ash shell by using sed to convert every character to an ASCII "x" before taking the length:
dummy=$(echo $hbar | sed 's/./x/g')
Oddly enough, when I run this in zsh on the new system, dummy equals "xxx", while with Bash on the new system, zsh on the old system, and even busybox ash, dummy equals "x". BTW, the busybox sed does not handle this correctly; I've been using the standard /bin/sed even with the busybox shell.
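To separate the shell from sed, the same substitution can be run once with the locale forced to C and once with whatever locale the shell passes along. This is a sketch under the assumption that the test character is U+210F; as_bytes and as_chars are illustrative names.

```shell
# Assumption: same test character as above (U+210F, three bytes in UTF-8).
hbar=$(printf '\342\204\217')

# In the C locale sed sees three separate bytes, so "." matches three times.
as_bytes=$(printf '%s\n' "$hbar" | LC_ALL=C sed 's/./x/g')

# With the inherited locale; a working UTF-8 locale should give a single "x".
as_chars=$(printf '%s\n' "$hbar" | sed 's/./x/g')

echo "C locale: $as_bytes   inherited locale: $as_chars"
```

sed inherits its locale from the shell that starts it, so if the inherited-locale result differs between Bash and zsh with the same /bin/sed, one possibility is that the two shells export different LC_* settings to child processes.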