Gentoo Forums
postgresql: encoding vs. locale
pactoo
Guru


Joined: 18 Jul 2004
Posts: 553

Posted: Mon Jan 21, 2008 11:50 am    Post subject: postgresql: encoding vs. locale

When creating a database, I can choose to set the locale and the encoding. What is the difference between those two settings, and what effect do they have on the operation of the database?

Anyone with a bit of insight, please?
YuriyRusinov
Apprentice


Joined: 21 Jul 2004
Posts: 208
Location: Saint-Petersburg, Russia

Posted: Mon Jan 21, 2008 2:13 pm    Post subject:

Hello !

The locale option sets the default locale for the database cluster; the encoding option sets the character encoding of the default database. An incorrect locale can cause functions such as like to work incorrectly. For example, if the locale is a KOI8-R one but the database encoding is UTF8, then like/ilike do not work for Russian letters.
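To make the mismatch more concrete, here is a small Python sketch. It is only an illustration under my own assumptions, not PostgreSQL's actual code path: it pretends that text stored as UTF-8 is case-converted by a routine that believes the bytes are KOI8-R. The helper name survives_uppercasing is made up for this example.

```python
# Illustration only: stored bytes are UTF-8, but the "locale" assumes KOI8-R.
text = "ж"  # Cyrillic small letter zhe

stored = text.encode("utf-8")      # what a UTF8 database would hold (2 bytes)
misread = stored.decode("koi8-r")  # how a KOI8-R routine would interpret them


def survives_uppercasing(raw: bytes) -> bool:
    """Uppercase raw as if it were KOI8-R, then check it is still valid UTF-8."""
    upper = raw.decode("koi8-r").upper().encode("koi8-r")
    try:
        upper.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(len(stored), repr(misread))
print(survives_uppercasing(stored))  # the Cyrillic letter gets corrupted
print(survives_uppercasing(b"abc"))  # plain ASCII is unaffected
```

The single letter turns into two unrelated characters when read as KOI8-R, and case conversion on those bytes no longer yields valid UTF-8, while ASCII text passes through untouched. That would be consistent with case-insensitive matching failing only for the Russian letters.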
_________________
Best regards,
Yuriy Rusinov.
pactoo
Guru


Joined: 18 Jul 2004
Posts: 553

Posted: Tue Jan 22, 2008 8:28 am    Post subject:

Thanks for your reply, but that does not make it any clearer to me. And from what I have gathered, I am not quite sure whether this is correct at all, since, with two exceptions, you are able to choose the locale for each database individually. The same is true for the encoding.

When creating a cluster, you currently only fix LC_COLLATE and LC_CTYPE, which cannot be changed afterwards. The remaining locale categories can be freely defined for each new database.

That does not, however, explain what practical consequences those decisions have. What do I need the locale for, and what do I need the encoding for?

What is the difference if I define a database with locale=de_DE or locale=C (is locale=C the same as specifying --no-locale during initdb)?
And if I define, let's say, locale de_DE, how would the database behaviour change if that same localized database was created with either an ISO8859 or a UNICODE encoding?

The only thing I have found out is that LC_COLLATE affects the ordering of (search?) results. Now, if I only use PHP-based web applications, some English, some translated, does this matter at all? And, more importantly, whether it does or not, why?
YuriyRusinov
Apprentice


Joined: 21 Jul 2004
Posts: 208
Location: Saint-Petersburg, Russia

Posted: Tue Jan 22, 2008 9:32 am    Post subject:

If you call something like this
Code:
initdb --encoding=<your_encoding>  --locale=<your_locale>
then the locale becomes the default locale for the database cluster, and the encoding becomes the encoding of the template database. When you create a database, you can define an individual encoding for it, but you cannot redefine the locale of the database cluster.
_________________
Best regards,
Yuriy Rusinov.
pactoo
Guru


Joined: 18 Jul 2004
Posts: 553

Posted: Tue Jan 22, 2008 1:47 pm    Post subject:

Well, I know that I cannot change the locale for the cluster. But that does not matter here and was never the question. My question was about the actual databases, since those are what my applications work with.

How do encoding and locale affect the daily operation of a database?

To put it a different way: I am stumbling over the following sentence in the PostgreSQL documentation:
Code:

One way to use multiple encodings safely is to set the locale to C or POSIX during initdb, thus disabling any real locale awareness.


What is the difference from an application point of view, be it the psql command or some PHP webapp, between running a database with --no-locale and running it with --locale=xx_XX.UTF8?

And what difference would it make, again from the application point of view, if each of the databases mentioned above were stored in either an ISO8859-15 encoded or a UTF8 encoded database?
YuriyRusinov
Apprentice


Joined: 21 Jul 2004
Posts: 208
Location: Saint-Petersburg, Russia

Posted: Tue Jan 22, 2008 3:28 pm    Post subject:

Quote:
When creating a database, I can choose to set the locale and the encoding

Whether you create a database with the createdb script or with the "CREATE DATABASE" command, you can choose only the encoding.
Quote:
And if I define, lets say, locale de_DE, how would the database behaviour change, if that same localized database was created with either an ISO8859 or an UNICODE encoding?

I have not looked into this deeply, but my guess is that the database string functions with an ISO8859 encoding can produce incorrect results for specific de_DE characters, because in general these characters cannot be converted into another locale. In any case, the behavior of the database depends on the locale of the database cluster.
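As a rough way to see the two settings separately, here is a Python sketch (again my own illustration, not anything PostgreSQL runs; the word list and the helper fits are made up). The encoding decides which characters can be stored and which bytes they become; the collation decides how strings are ordered. Sorting by raw bytes is used here as a stand-in for what the C/POSIX locale does.

```python
words = ["Zebra", "apfel", "Äpfel", "Apfel", "zebra"]

# Encoding: the same characters become different bytes.
latin9 = "Äpfel".encode("iso8859-15")  # one byte for the umlaut
utf8 = "Äpfel".encode("utf-8")         # two bytes for the umlaut
print(len(latin9), len(utf8))


def fits(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(fits("Äpfel", "iso8859-15"))  # German text fits
print(fits("ж", "iso8859-15"))      # Cyrillic does not
print(fits("ж", "utf-8"))           # UTF-8 can hold both

# Collation: with the C locale, strings compare byte by byte.
c_order = sorted(words, key=lambda w: w.encode("utf-8"))
print(c_order)
```

In byte order all uppercase ASCII letters come before all lowercase ones, and the umlaut ends up after "zebra". A de_DE collation would normally keep "apfel", "Apfel" and "Äpfel" next to each other instead, which is the kind of difference an application sees in ORDER BY results.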
_________________
Best regards,
Yuriy Rusinov.
All times are GMT