Gentoo Forums
Unicode support broken in zsh-5.0.2 and zsh-5.0.2-r3
BitJam
Advocate


Joined: 12 Aug 2003
Posts: 2429
Location: Silver City, NM

Posted: Fri Aug 30, 2013 6:26 am    Post subject: Unicode support broken in zsh-5.0.2 and zsh-5.0.2-r3

I'm building a new system and installed the latest zsh from Portage. I noticed that zsh no longer recognizes Unicode when determining the length of a string: it now counts bytes instead of characters. Also, I can no longer copy and paste Unicode characters into the zsh command line. This all works fine with Bash on the new system and also with zsh-4.3.17 (which is no longer in Portage) on an older system.

To test the problem I made a one-line script:
Code:
hbar="─"
This is equivalent to:
Code:
hbar=$'\xe2\x94\x80'
in both Bash and Zsh.
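
One way to check that equivalence without relying on how the terminal renders the character is to dump the raw bytes. This is only a sketch: the variable names are made up for illustration, and printf with octal escapes is used instead of $'...' so that it also runs in shells without that quoting form.

```shell
# Literal character, as in the one-line script
literal="─"
# Same bytes via printf octal escapes (\342\224\200 = 0xe2 0x94 0x80)
escaped=$(printf '\342\224\200')

# Hex-dump each value; od prints the bytes, tr strips the whitespace
literal_hex=$(printf '%s' "$literal" | od -An -tx1 | tr -d ' \t\n')
escaped_hex=$(printf '%s' "$escaped" | od -An -tx1 | tr -d ' \t\n')

echo "literal=$literal_hex escaped=$escaped_hex"
```

If the script file is saved as UTF-8, both values should come out as e29480.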

I can source this script in zsh and echo $hbar shows the unicode character as expected. But if I run:
Code:
echo ${#hbar}
it says "3" when it should say "1". Both Bash and the older zsh correctly say "1".
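
To separate what the shell reports from what is actually stored, the byte and character counts can also be taken outside the shell with wc. A minimal sketch, assuming a wc that supports -m (character count); the character count depends on the locale wc runs under, so it is only expected to be 1 in a UTF-8 locale:

```shell
hbar=$(printf '\342\224\200')   # the same three bytes as hbar="─"

# wc -c always counts bytes; wc -m counts characters per the current locale
bytes=$(printf '%s' "$hbar" | wc -c | tr -d ' ')
chars=$(printf '%s' "$hbar" | wc -m | tr -d ' ')

echo "bytes=$bytes chars=$chars shell_length=${#hbar}"
```

In a UTF-8 locale, a shell_length equal to bytes rather than chars would suggest that the shell itself is counting bytes.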

Here are the package settings:
Code:
=================================================================
                        Package Settings
=================================================================

app-shells/zsh-5.0.2-r3 was built with the following:
USE="gdbm pcre unicode -caps -debug -doc -examples -maildir -static"

These are the same USE flags that were used for the working zsh-4.3.17. I'd be glad to post the full emerge --info if anyone thinks that might help.

I set LANG to en_US.UTF-8 in /etc/env.d/02locale and ran env-update and source /etc/profile. The locale command confirms that is what LANG is set to. I don't know what else to do to get zsh to deal with unicode correctly. I found other unicode problems in the new zsh but this one was the easiest to explain. Those problems also had zsh counting bytes instead of characters.
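
Since LANG is only the lowest-priority locale setting, it may be worth checking whether something else overrides it inside the zsh session. The sketch below is not a diagnosis, just the standard POSIX precedence (LC_ALL, then LC_CTYPE, then LANG); the function name effective_ctype is made up for illustration.

```shell
# Print the locale name that governs character handling (LC_CTYPE category).
# Per POSIX, a non-empty LC_ALL wins, then LC_CTYPE, then LANG; if none is
# set, the implementation default applies (shown here as "C").
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

echo "effective ctype locale: $(effective_ctype)"

# If the locale utility is available, ask it for the resulting encoding
if command -v locale >/dev/null 2>&1; then
    locale charmap
fi
```

Running this in both Bash and zsh on the same system would show whether the two shells see the same settings. Anything other than a UTF-8 locale (or a charmap other than UTF-8) would point at the environment rather than at zsh.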

I have worked around a similar problem in the busybox ash shell by using sed to convert every character to an ASCII "x" before taking the length:
Code:
dummy=$(echo $hbar | sed 's/./x/g')
echo ${#dummy}

Oddly enough, when I run this in zsh on the new system, dummy is equal to "xxx", whereas with Bash on the new system, zsh on the old system, and even busybox ash, dummy equals "x". BTW, the busybox sed does not handle this correctly, so I've been using the standard /bin/sed even with the busybox shell.
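
Another possible workaround that does not depend on the locale at all is to count only the bytes that start a UTF-8 character, i.e. everything except continuation bytes (0x80 to 0xBF, octal 200 to 277). This is a sketch of a different technique than the sed one above; it assumes the input is valid UTF-8, and the function name utf8_len is made up for illustration.

```shell
# Count UTF-8 characters by deleting continuation bytes and counting the rest.
# LC_ALL=C makes tr treat the input as plain bytes.
utf8_len() {
    printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '
}

hbar=$(printf '\342\224\200')
utf8_len "$hbar"          # one 3-byte character
utf8_len "a${hbar}b"      # two ASCII characters plus one 3-byte character
```

Like the sed version, this counts code points, so combining characters would each be counted separately.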