Czech support in MySQL
Czech collation support for MySQL 3.23
Current versions of MySQL (3.23+) contain support for a Czech collation that approximates the Czech standard. The support has to be compiled into the server.
To set the Czech collation as the default one, compile the server with the ./configure option --with-charset=czech. In this case, all sorting on character columns will use Czech rules.
When the compile-time option ./configure --with-extra-charsets=all is used, the server will support multiple character sets and collations, and the actual variant can be set at server startup with the run-time parameter --default-character-set=czech. The default is again given by the parameter --with-charset.
When you change the character set used by the server, indexes have to be regenerated; see the chapter The Character Set Used for Data and Sorting in the MySQL documentation.
Server parameters can also be specified in a configuration file, typically /etc/my.cnf, using
[mysqld]
default-character-set=czech
This Czech collation table implements a case-sensitive ordering of letters. The MySQL manual talks about case insensitivity, but that only holds in the default (Latin1) configuration.
Character sets
Sorting uses the ISO-8859-2 character set. If your data on the client side is in the Windows-1250 character set, on-line translation of character sets between the server and the client can be set up. People often realize they have Windows-1250 data when words with the letters š and ž get sorted incorrectly: ISO-8859-2 and Windows-1250 are similar but not exactly the same. The server has to have this feature compiled in; the easiest way is to uncomment the following line in the file sql/convert.cc before compilation:
/* #define DEFINE_ALL_CHARACTER_SETS */
Then, in the client, issue the command
SET CHARACTER SET cp1250_latin2
The client will then work with data in Windows-1250 and the server will store it in ISO-8859-2.
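To see why š and ž in particular go wrong, the two encodings can be compared byte by byte. The following is only a minimal Python sketch that uses Python's built-in codecs, not anything from MySQL; the function names differing_letters and cp1250_to_latin2 are hypothetical and chosen for illustration.

```python
# Illustration only: compares the two encodings with Python's own codecs.
# Nothing here is part of MySQL; the function names are hypothetical.

# Czech letters with diacritics, lower and upper case.
CZECH_LETTERS = "áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ"


def differing_letters(letters=CZECH_LETTERS):
    """Return the letters whose byte value differs between the encodings."""
    return [ch for ch in letters
            if ch.encode("iso8859_2") != ch.encode("cp1250")]


def cp1250_to_latin2(data: bytes) -> bytes:
    """Re-encode Windows-1250 bytes as ISO-8859-2 bytes."""
    return data.decode("cp1250").encode("iso8859_2")


if __name__ == "__main__":
    for ch in differing_letters():
        print(ch, ch.encode("cp1250").hex(), ch.encode("iso8859_2").hex())
```

Only a handful of Czech letters have different byte values in the two encodings, which is why unconverted Windows-1250 data mostly looks fine and only some words end up in the wrong place.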
Server messages
The MySQL distribution contains a message catalogue translated to Czech; the translated messages are turned on at server start-up with the parameter --language=czech.
Support for half-Czech collation in MySQL in Windows-1250
The distribution mysql-3.23.42-win1250ch-1.tar.gz contains the file strings/ctype-win1250ch.c, which implements a simpler two-pass sorting similar to the Czech one. In this collation, ch is sorted correctly, but the primary ordering is case insensitive and is in the Windows-1250 character set. The distribution also includes patches for Configure.in and sql/share/charsets/Index.
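The idea of a two-pass ordering in which ch counts as a single letter can be sketched in Python. This is not the algorithm in strings/ctype-win1250ch.c. It is a simplified illustration that assumes č, ř, š and ž are separate letters at the primary level, while other diacritics and letter case only break ties. The names PRIMARY, RANK, tokens and czech_key are hypothetical.

```python
import unicodedata

# Assumed primary-level alphabet: "ch" is a single letter placed after "h";
# č, ř, š, ž are treated as letters of their own (an assumption of this sketch).
PRIMARY = ["a", "b", "c", "č", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
           "l", "m", "n", "o", "p", "q", "r", "ř", "s", "š", "t", "u", "v",
           "w", "x", "y", "z", "ž"]
RANK = {letter: i for i, letter in enumerate(PRIMARY)}


def tokens(word):
    """Split a lowercased word into letters, keeping 'ch' as one token."""
    word = word.lower()
    out, i = [], 0
    while i < len(word):
        if word.startswith("ch", i):
            out.append("ch")
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out


def base_letter(token):
    """Drop diacritics that are only secondary (e.g. á -> a), keep č ř š ž."""
    if token in RANK:
        return token
    return unicodedata.normalize("NFD", token)[0]


def czech_key(word):
    """First pass: case-insensitive letter ranks. Second pass: raw string."""
    primary = [RANK.get(base_letter(t), len(PRIMARY)) for t in tokens(word)]
    return (primary, word)


if __name__ == "__main__":
    print(sorted(["chata", "cihla", "hrad", "idea", "Cena"], key=czech_key))
```

With this key, a word starting with ch lands between the h words and the i words instead of among the c words, which is the behaviour the text describes as ch being sorted correctly.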
Support for nearly complete UCA UTF-8 collation in MySQL
The distribution mysql-3.23.42-utf8adnocase.tar.gz contains support for ordering based on the UCA (Unicode Collation Algorithm). Two variants are included: case sensitive and case insensitive.
Both (collation) character sets, win1250ch and utf8ad, should by now also be included in all current versions of MySQL.
Stripping diacritics in MySQL (il2 to ascii)
I wrote functions that can be used in MySQL to convert text to plain ASCII and to convert between ISO-8859-2 and Windows-1250. The distribution is called udf_charsets-1.0.tar.gz and also contains a README with installation instructions.
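For readers who only want to see what stripping diacritics means, here is a minimal Python sketch. It is not the code from udf_charsets and does not run inside MySQL. It relies on Unicode decomposition from the standard library, and the function name strip_diacritics is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose each character and drop the combining marks (é -> e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(strip_diacritics("Příliš žluťoučký kůň"))
```

This approach handles the Czech letters, which all decompose into a base letter plus a combining mark. Characters without such a decomposition (for example ł or ß) pass through unchanged and would need an explicit mapping table.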
Author
Copyright: (c) 1998--2001 Jan Pazdziora. All rights reserved. This package is free software; you can redistribute it and/or modify it under the terms of either GPL or Artistic Licence, whichever you like more.