There is already plenty of material online about locales on Linux, so there was no real need to add more, but I still feel a summary is worthwhile ^_^ .

The environment used in this article is CentOS 6.5.

1. Viewing the current locale

[chenhj@node1 ~]$ locale
LANG=en_US.UTF-8
LC_CTYPE="zh_CN.UTF8"
LC_NUMERIC="zh_CN.UTF8"
LC_TIME="zh_CN.UTF8"
LC_COLLATE="zh_CN.UTF8"
LC_MONETARY="zh_CN.UTF8"
LC_MESSAGES="zh_CN.UTF8"
LC_PAPER="zh_CN.UTF8"
LC_NAME="zh_CN.UTF8"
LC_ADDRESS="zh_CN.UTF8"
LC_TELEPHONE="zh_CN.UTF8"
LC_MEASUREMENT="zh_CN.UTF8"
LC_IDENTIFICATION="zh_CN.UTF8"
LC_ALL=zh_CN.UTF8

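The output above lists the locale categories, each covering one region-specific aspect such as date format or currency. Values shown in double quotes are not set explicitly; they are implied by LC_ALL (or LANG).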

2. Setting the locale

The locale can be changed at any time by setting environment variables. The precedence is:

LC_ALL > LC_* > LANG

For example:

[root@node1 ~]# export LC_ALL=zh_CN.utf8

Or edit the system-wide setting in /etc/sysconfig/i18n:

[root@node1 ~]# cat /etc/sysconfig/i18n
LANG="en_US.UTF-8"
SYSFONT="latarcyrheb-sun16"
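
The precedence rule LC_ALL > LC_* > LANG can be sketched as a small shell function. This is only an illustration of the lookup order, not how the C library implements it; the function name effective_locale is made up for this example, and an empty variable is treated the same as an unset one.

```shell
#!/bin/sh
# effective_locale CATEGORY  (hypothetical helper, not a system command)
# Print the locale name that applies to CATEGORY (for example LC_TIME),
# following the precedence LC_ALL > LC_* > LANG.
effective_locale() {
    category=$1
    # Indirect lookup: read the variable whose name is stored in $category.
    eval "specific=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"        # LC_ALL overrides everything
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"      # then the category's own variable
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"          # then the LANG fallback
    else
        printf '%s\n' "C"              # nothing set: the default C locale
    fi
}
```

Calling effective_locale LC_TIME in the session from section 1 would print the LC_ALL value, because LC_ALL is set there. Only pass fixed category names to it, since the argument goes through eval.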

3. Listing the locales the system supports

To list all locales related to zh_CN:

[chenhj@node1 ~]$ locale -a|grep zh_CN
zh_CN
zh_CN.gb18030
zh_CN.gb2312
zh_CN.gbk
zh_CN.utf8
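
Note that the list shows zh_CN.utf8, while the earlier output used spellings such as zh_CN.UTF8 and en_US.UTF-8. As far as I know, glibc compares the codeset part case-insensitively and ignores punctuation, which is why these spellings are interchangeable. A rough sketch of that normalization, so a name can be compared against the locale -a list (the function name normalize_locale_name is made up here, and modifiers such as @euro are not handled):

```shell
#!/bin/sh
# normalize_locale_name NAME  (hypothetical helper)
# Rewrite the codeset part (after the first dot) into the lowercase,
# punctuation-free spelling that `locale -a` prints, e.g. UTF-8 -> utf8.
normalize_locale_name() {
    case $1 in
        *.*)
            base=${1%%.*}       # language_TERRITORY part, left untouched
            codeset=${1#*.}     # everything after the first dot
            codeset=$(printf '%s' "$codeset" \
                | tr -cd '[:alnum:]' \
                | tr '[:upper:]' '[:lower:]')
            printf '%s.%s\n' "$base" "$codeset"
            ;;
        *)
            printf '%s\n' "$1"  # no codeset: nothing to normalize
            ;;
    esac
}
```

For example, locale -a | grep -Fx "$(normalize_locale_name zh_CN.UTF-8)" searches the list for the exact entry zh_CN.utf8.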

4. Where are locales defined?

Locale definitions live in the /usr/share/i18n/locales directory.

For example, the zh_CN definition file is:

/usr/share/i18n/locales/zh_CN

An excerpt:
...

LC_CTYPE
% This is a copy of the "i18n" LC_CTYPE with the following modifications:
% - Additional classes: hanzi

copy "i18n"

translit_start
include "translit_combining";""
translit_end

class "hanzi"; /
% <U3400>..<U4DBF>;/
        <U4E00>..<U9FA5>;/
        <UF92C>;<UF979>;<UF995>;<UF9E7>;<UF9F1>;<UFA0C>;<UFA0D>;<UFA0E>;/
        <UFA0F>;<UFA11>;<UFA13>;<UFA14>;<UFA18>;<UFA1F>;<UFA20>;<UFA21>;/
        <UFA23>;<UFA24>;<UFA27>;<UFA28>;<UFA29>
END LC_CTYPE

% ISO 14651 collation sequence
LC_COLLATE
copy "iso14651_t1_pinyin"
END LC_COLLATE

...

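The LC_CTYPE section above defines the character class “hanzi” for Simplified Chinese, and the LC_COLLATE section defines pinyin ordering for Chinese characters.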

We can also open the definition file for the pinyin ordering:

/usr/share/i18n/locales/iso14651_t1_pinyin

An excerpt:

LC_COLLATE

copy "iso14651_t1_common"

script <HAN>

order_start <HAN>;forward;forward;forward;forward,position
<U5416> <U5416>;IGNORE;IGNORE;IGNORE #吖104
<U814C> <U814C>;IGNORE;IGNORE;IGNORE #腌185
<U9312> <U9312>;IGNORE;IGNORE;IGNORE #錒0
<U9515> <U9515>;IGNORE;IGNORE;IGNORE #锕7
<U963F> <U963F>;IGNORE;IGNORE;IGNORE #阿23237
<U55C4> <U55C4>;IGNORE;IGNORE;IGNORE #嗄60
<U554A> <U554A>;IGNORE;IGNORE;IGNORE #啊16566
<U54C0> <U54C0>;IGNORE;IGNORE;IGNORE #哀4070
<U54CE> <U54CE>;IGNORE;IGNORE;IGNORE #哎2473
...

One look at the comments makes it clear: the entries really are ordered by pinyin.
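
The effect of LC_COLLATE is easiest to see with sort. The sketch below uses four characters from the excerpt above; the function name sort_with_locale is made up for this example. Under the C locale, sort compares raw bytes, so the result follows code point order. With zh_CN.utf8 (assuming that locale is installed) one would expect the order of the excerpt instead: 吖, 阿, 啊, 哀. If the requested locale does not exist, sort may quietly fall back to byte order.

```shell
#!/bin/sh
# sort_with_locale LOCALE  (hypothetical helper)
# Sort the lines on stdin using the collation rules of LOCALE.
# The assignment applies to this one sort process only.
sort_with_locale() {
    LC_ALL=$1 sort
}

# Byte order (C locale): follows the UTF-8 / code point values.
printf '%s\n' 阿 哀 啊 吖 | sort_with_locale C

# Pinyin order, assuming zh_CN.utf8 is installed on the system.
printf '%s\n' 阿 哀 啊 吖 | sort_with_locale zh_CN.utf8
```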

5. Where are character sets defined?

Character sets are all defined under the /usr/share/i18n/charmaps directory.

For example, the GB2312 definition file is:

/usr/share/i18n/charmaps/GB2312.gz

An excerpt:

<code_set_name> GB2312
<mb_cur_max> 2
<mb_cur_min> 1
<comment_char> %
<escape_char> /
% Chinese charmap for EUC-CN = GB2312 = union of ASCII and GB_2312-80
% version: 1.0
% Contact: ha_shao
% Email: hashao@china.com
% Distribution and use is free, even for comercial purpose.
%
CHARMAP
<U0000> /x00 NULL (NUL)
<U0001> /x01 START OF HEADING (SOH)
<U0002> /x02 START OF TEXT (STX)
<U0003> /x03 END OF TEXT (ETX)
<U0004> /x04 END OF TRANSMISSION (EOT)
<U0005> /x05 ENQUIRY (ENQ)
<U0006> /x06 ACKNOWLEDGE (ACK)
<U0007> /x07 BELL (BEL)
<U0008> /x08 BACKSPACE (BS)
<U0009> /x09 CHARACTER TABULATION (HT)
...
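
Each charmap line maps a Unicode code point (first column) to its byte sequence in that character set (second column). A minimal sketch for looking up a single code point, assuming the file has the whitespace-separated layout shown above; the function name charmap_bytes is made up for this example:

```shell
#!/bin/sh
# charmap_bytes FILE CODEPOINT  (hypothetical helper)
# Print the byte-sequence column for CODEPOINT (written like U0007)
# from a charmap source file. Files ending in .gz are decompressed first.
charmap_bytes() {
    case $1 in
        *.gz) gzip -dc -- "$1" ;;
        *)    cat -- "$1" ;;
    esac | awk -v key="<$2>" '$1 == key { print $2 }'
}
```

For example, charmap_bytes /usr/share/i18n/charmaps/GB2312.gz U0007 looks up the BELL entry from the excerpt, and the same call with U554A looks up the GB2312 encoding of 啊.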

6. Creating a locale

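The locale definition files and charmap definitions described above are the equivalent of source code. What we actually use is a locale compiled from a locale definition file plus a charmap, and the compiled locale is added to the locale-archive file.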

Locales are created with localedef:

man localedef

An excerpt:

...
The localedef program reads the indicated charmap and input files, compiles them to a form usable by the locale(7) functions in the C library, and places the six output files in the outputpath directory.
...

Let's try defining a locale of our own:

[root@node1 ~]# localedef -f UTF-8 -i zh_CN myzh
[root@node1 ~]# locale -a | grep myzh
myzh
myzh.utf8

Every locale created this way is added to the locale-archive file, which is located at /usr/lib/locale/locale-archive:

[root@node1 ~]# grep myzh /usr/lib/locale/locale-archive
Binary file /usr/lib/locale/locale-archive matches
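
Grepping the binary archive only shows that the string occurs somewhere in it. A slightly more precise check is to list the archive with localedef --list-archive and look for an exact name. A small sketch (the function name has_locale is made up for this example; it just reads names from stdin):

```shell
#!/bin/sh
# has_locale NAME  (hypothetical helper)
# Read locale names from stdin, one per line, and succeed only if
# NAME matches a whole line exactly (no substring matches).
has_locale() {
    grep -Fqx -- "$1"
}
```

Usage would look like: localedef --list-archive | has_locale myzh.utf8 && echo found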

7. Getting localized messages

[root@node1 ~]# export LC_ALL=myzh.utf8
[root@node1 ~]# ls xx
ls: cannot access xx: No such file or directory

Why is the message still in English?
Let's see what it is doing!

[root@node1 ~]# strace -eopen ls xx
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/lib64/libselinux.so.1", O_RDONLY) = 3
open("/lib64/librt.so.1", O_RDONLY) = 3
open("/lib64/libcap.so.2", O_RDONLY) = 3
open("/lib64/libacl.so.1", O_RDONLY) = 3
open("/lib64/libc.so.6", O_RDONLY) = 3
open("/lib64/libdl.so.2", O_RDONLY) = 3
open("/lib64/libpthread.so.0", O_RDONLY) = 3
open("/lib64/libattr.so.1", O_RDONLY) = 3
open("/proc/filesystems", O_RDONLY) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
open("/usr/share/locale/myzh.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/myzh/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
ls: cannot access xxopen("/usr/share/locale/myzh.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/myzh/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
: No such file or directory

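So it cannot find the localized message catalogs (coreutils.mo and libc.mo).
localedef only defines the basic content of a locale; the localized resources that each application uses have to be added separately.
For now, let's borrow them from somewhere else as a stopgap!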

[root@node1 ~]# ls /usr/share/locale/myzh
ls: cannot access /usr/share/locale/myzh: No such file or directory
# Add a symlink:
[root@node1 ~]# ln -sf /usr/share/locale/zh_CN /usr/share/locale/myzh

Try again, and now it works.

[root@node1 ~]# ls xx
ls: 无法访问xx: 没有那个文件或目录
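
The lookup order seen in the strace output can be sketched as follows: first the full locale name, then the name with the codeset suffix removed. The function name catalog_candidates is made up for this example, and it reproduces only the two paths visible in the trace. The real lookup in the C library may try further variants for other locale names.

```shell
#!/bin/sh
# catalog_candidates LOCALE DOMAIN  (hypothetical helper)
# Print the message catalog paths in the order seen in the trace:
# the full locale name first, then the name without its codeset suffix.
catalog_candidates() {
    localedir=/usr/share/locale     # directory taken from the strace output
    printf '%s/%s/LC_MESSAGES/%s.mo\n' "$localedir" "$1" "$2"
    case $1 in
        *.*)
            # Strip ".utf8" (or any other codeset) and try again.
            printf '%s/%s/LC_MESSAGES/%s.mo\n' "$localedir" "${1%%.*}" "$2"
            ;;
    esac
}

catalog_candidates myzh.utf8 coreutils
```

This also shows why the symlink helps: once /usr/share/locale/myzh points at the zh_CN directory, the second candidate path resolves to the existing Chinese catalog.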

Since the myzh locale above was only for demonstration, the last step is to delete this temporary locale again:

[root@node1 ~]# localedef --delete-from-archive myzh
[root@node1 ~]# rm -f /usr/share/locale/myzh

Author: skykiker
http://blog.chinaunix.net/uid-20726500-id-4662320.html