Linux Internationalization Problems
by Juraj Bednar, in Editorials - Sat, Oct 21st 2000 23:59 UTC
Linux continues its march to the desktop, strengthened by the arrival
of Open Office and other non-hacker applications, but what good are
these apps to you if they don't speak your language? In today's
editorial, Juraj Bednar asks that the community not forget
localization if it wants Linux to be an alternative for the
non-English-speaking world.
Copyright notice: All reader-contributed material on freshmeat.net
is the property and responsibility of its author; for reprint rights, please contact the author
directly.
Many people in English-speaking countries are pushing Linux to the
office. They now have wonderful Open Source office suites like
AbiSuite, KOffice, and Open Office, and some commercial office suites
as well. Everything works just fine for English users (about 508
million people speak English, which is about 12% of the world's
population; the percentage is much higher when counting people
connected to the Internet, but it's still not everyone). There are
some efforts to support languages that are especially hard to handle
(Chinese, Japanese, Korean) because they use entirely different
writing systems, but others are not receiving the attention they need.
In this article, I would like to explain the basic issues with Central
European languages and how to avoid making mistakes. The first step
in making Linux "your-language-friendly" is to create a locale for
your language. A locale is a set of definitions of how to represent
and process various data types like time, date, monetary symbols,
special characters, and so on. One of the important parts of a locale
is the so-called message translation definition, a set of files which
define how certain messages are translated to that particular
language. There's usually one such file for an application, a hash
table which contains all the application's messages, so it's generally
the translation of the program's user interface.
The problem is not related to including these locales in certain
distributions (they're part of glibc, and it's quite easy to add to
glibc if you want to). The problem is with setting the locale
parameters for each user. This is what almost no distribution
considers when setting up users. There should not be a system-wide
default, because Linux is a multiuser environment. Each user should be
able to set his own language variables. If he wants to do it now, he
has to edit his .bashrc or similar file to have the proper values
set. This is not very user-friendly.
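To make "locale parameters" concrete, here is a minimal sketch of the order in which POSIX programs consult the locale variables for one category: LC_ALL first, then the category's own variable, then LANG. The helper name effective_locale() is hypothetical and exists only for this illustration; GNU gettext additionally consults LANGUAGE for messages, which is left out here.

```python
# Sketch of the POSIX lookup order for a single locale category.
# effective_locale() is a hypothetical helper, not part of any library.

def effective_locale(env, category):
    """Return the locale name a program would use for `category`.

    Lookup order: LC_ALL, then the category variable (for example
    LC_MESSAGES), then LANG. Empty values count as unset, and the
    fallback is the "C" locale.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"


# A user who wants Slovak conventions but English program messages.
user_env = {"LANG": "sk_SK", "LC_MESSAGES": "en_US"}
print(effective_locale(user_env, "LC_MESSAGES"))
print(effective_locale(user_env, "LC_TIME"))
```

A per-user configuration tool would only need to write these few variables somewhere every login path reads them.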
There is almost no problem with locales and translations, so, in most
distributions, you can see the messages in your language when you set
the correct locale and have the messages installed. Now we want to
type our characters and see them, so we need fonts for
displaying. There are not enough free fonts, but most distributions
include those which are available for each character encoding.
Keyboards are more difficult; there's no general way for users to
configure them. Many distributions with graphical installations used
to read a directory called rulesets, which is outdated and contains
deceiving information (for example, a "Czechoslovakian keyboard" --
complete nonsense, since Czechs and Slovaks use different keyboards
and different characters). There is also the problem of not setting
the correct locale (which causes the keymap to not work
correctly). These are major internationalization problems which are
not so difficult to solve. All it takes is a bit of good will
from the distribution creators (they can contact me if they want to
discuss something; I really want Linux to be usable in my country and
with my language).
There are also more difficult problems to solve. One is the problem
of locale and switching keyboards. When I went to Norway last year to
visit a friend, I found a problem: I wanted to switch between Slovak,
English, and Norwegian keyboards. Since it's quite easy under some
systems, I thought it would be no problem with Linux. I launched
xkbsel, which switches the keyboards "on the fly". The problem was
that the keyboard doesn't work without the correct locale. If I
started xterm with locale set to Slovak, I could not type Norwegian
characters. If I set it to Norwegian and started another xterm, the
Slovak characters were not working. The cause was that the Slovak
keyboard mapped keys to ISO8859-2 characters, while the Norwegian
keyboard used the ISO8859-1 charset. There are characters in one
charset which are not present in the other. It was not possible to
use the particular keyboard without setting the corresponding
locale. Currently, this means restarting the application with the
correct locale set.
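The charset mismatch can be checked directly. The following sketch uses Python's standard codecs to test whether a word fits into ISO8859-1 or ISO8859-2; the helper name encodable() and the sample words are chosen for illustration only.

```python
# Check which 8-bit charset can represent which word.
# encodable() is a hypothetical helper for this illustration.

def encodable(text, charset):
    """Return True if every character of `text` exists in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


slovak = "šťastie"      # contains š and ť, which ISO8859-1 lacks
norwegian = "blåbær"    # contains å and æ, which ISO8859-2 lacks

for word in (slovak, norwegian):
    for charset in ("iso-8859-1", "iso-8859-2"):
        print(word, charset, encodable(word, charset))
```

Each word fits only its "own" charset, which is why a single 8-bit locale cannot serve both keyboards at once.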
The next problem arises when translating applications. Currently, we
mostly use GNU gettext to do the translation. It is quite nice, but,
in some cases, not sufficient. In many languages, the translation of
a sentence can differ according to the context. Since there is no
context information, it is difficult to make correct translations.
The KDE team solved this issue by putting the context information into
the message identifiers, so it works correctly with a few workarounds,
but that's not a real solution. In English, for example, a noun
differs only in its singular and plural forms (you have "one file" and
"two files"). In Slavic languages, the plural form is often not
regular (in Slovak: "1 súbor", "2, 3, 4 súbory", "5,
... súborov"). This is another issue to be considered when
creating an application (currently, the programmer has to think about
this, but the easy solution would be to create a framework).
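As a minimal sketch of what such a framework has to decide, the Slovak rule from the example above (one form for 1, another for 2 to 4, a third for everything else) fits in a few lines. The function names are hypothetical and the word forms are the ones quoted in the text.

```python
# Slovak plural selection for the "súbor" example in the text.
# slovak_plural_index() and count_files() are illustrative names only.

FORMS = ("súbor", "súbory", "súborov")


def slovak_plural_index(n):
    """Pick the plural form: 0 for n == 1, 1 for 2..4, 2 otherwise."""
    if n == 1:
        return 0
    if 2 <= n <= 4:
        return 1
    return 2


def count_files(n):
    """Format a file count with the matching Slovak noun form."""
    return f"{n} {FORMS[slovak_plural_index(n)]}"


for n in (1, 2, 4, 5, 21):
    print(count_files(n))
```

The point of a shared framework is that the translator, not the programmer, would supply both the forms and the selection rule.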
The KDE team is developing workarounds for most of the problems I
describe here, but I also want other developers and distribution
manufacturers to be aware of these problems and to try to solve them.
Otherwise, Linux will stay English-centric, and that would be bad for
Linux itself.
Juraj Bednar (http://www.darkie.sk/index.en.php)
is a security consultant and a columnist for a Slovak computer
magazine. He has been a member of the KDE i18n team since the 1.0 release.
He can be reached at bednar@rak.isternet.sk.
Referenced projects
gettext - The GNU internationalization library.
GNU C library - The C library used in the GNU system.
Comments
[»]
Using gettext to display plurals
by Pavel Roskin - Sep 24th 2002 00:56:19
GNU gettext supports plurals starting with version 0.10.36 (released in
March 2001). Look for the ngettext() function in the manual.
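A small sketch of the ngettext() calling convention, using Python's standard gettext module rather than the C API. NullTranslations is used here as an assumption so the example runs without any installed catalog; it simply falls back to the English singular/plural rule. With a real catalog, the form would be chosen by that catalog's plural rule instead.

```python
import gettext

# No message catalog is loaded, so this object applies the default
# rule: first string when n == 1, second string otherwise.
translations = gettext.NullTranslations()


def files_message(n):
    """Return a count message using the ngettext() convention."""
    template = translations.ngettext("%d file", "%d files", n)
    return template % n


print(files_message(1))
print(files_message(5))
```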
[»]
cyrillic
by nikola - Mar 15th 2001 03:42:50
I am very disappointed that I haven't found a Linux
distribution supporting Bulgarian. Recently I've
tried RedHat 7.0, Slackware with kernel 2.2.16,
Mandrake helium ...
None of them succeeded in providing writing in
Bulgarian, either on the console or in office
applications, not to mention printing.
I'm kind of angry.
[»]
Linux Internationalization problems
by Ram Viswanadha - Oct 28th 2000 17:42:04
Has anyone checked out ICU?
The article mentions problems with the concept of singular and plural
nouns in languages... this is already taken care of by the ChoiceFormat
API. Rendering of complex scripts is available, as are bidirectional
rendering, word breaking, and line breaking. ICU currently ships with
Debian distros ;-)
Nagari scripts?
I do not agree with the comment that Unicode is inadequate for representing
Nagari scripts. Unicode's Character-Glyph
model addresses a character as a "character", not the associated glyph. Every
Nagari variant alphabet has a finite number of characters, and sounds are
represented by
the use of conjuncts/ligatures. Unicode defines a standard
algorithm for rendering Indic scripts, and I have not seen a
problem rendering them provided you have a smart layout engine. Taking the
example of Bangalore, it will look funny if you look at the hex dump because
most people are used to the English/ASCII form of representation, but the
important thing to understand here is that when a ligature has to be
formed between a consonant+consonant+vowel, which is GA+LA+OO, the base
consonant sounds as if there is a virama attached to it, i.e., it is GG;
the secondary consonant is stressed with the vowel. IMHO complex ligatures
can be adequately represented.
I18N.
The Posix locale format is too dumb for localization; it has no concept of
inheritance. The latest technical report on the locale format for Posix, TR 14652,
is built on the same flawed model, so I cannot expect it to be any better. I
prefer the ICU/Java locale model.
[»]
Don't cry, just do it
by Dan Ohnesorg - Oct 24th 2000 11:05:44
I must again say something about the comment from Egmont Koblinger. We have
similar problems, but we have solved many of them.
You say there is something untranslated or some bad translation of
something. But that is not a problem of the developers. It is a problem of
the Hungarian localization teams. You should help them make perfect
translations. I know it is difficult, and I see that there are many more
programs translated into Czech than, for example, into German. And there
are 10 million Czechs in the world and 100 million Germans, and there aren't
people who would translate something into their native language. I think
the problem is that every German can buy Windows for a week of work, but
only a few Czechs can buy Windows for two months' salary, so there is big
interest in things which are cheap. But there are also people who
don't find it boring to translate something in the evenings. You should find
such people in Hungary and organize them. Why should the Hungarians have
fewer translations than the Czechs?
We also have two versions of locales: one has 3-letter names of
months (this is very unusual, because our names don't differ in the first
three characters) and a second version has longer names. Even mc can
work with them.
TeX and LaTeX: we had many problems with these packages, because even
Donald Knuth did not know all the characters which the Czech language uses.
But we also have very good TeX gurus, who made csplain and cslatex,
which can use our characters, our hyphenation, our special modes of using
hard spaces, etc. Thanks to SuSE we even have PostScript fonts in our
encodings.
Everything can be done, but we are the workers who must do it.
Even if you cannot make patches, you can at least send letters to the
developers. Sometimes it helps very quickly (yes, I have sent about 50
letters to Netscape and nothing got better, but that is another story).
For example, modlogan now supports Czech and other languages, and I
sent only ONE e-mail with an exact description of the problem and with a
suggestion for the developer. There was only one problem: no one had
reported that it has problems with other languages.
Everyone should send letters to developers, because now they are
saying that the Czechs always want something special...
[»]
Still lot of work to do...
by egmont - Oct 23rd 2000 07:46:19
We must distinguish several kinds of users. There are hackers who know
how to set their own LANG and other stuff, but English is usually right
for them. There are users who can spend several hours in front of their
computers to find out how to set LANG, and sooner or later they will do
this. There are system administrators, who want to set a default LANG
for users but make it easily changeable for them. And there are users
who don't know anything about that; they only want the computer to talk
to them using their native tongue. As Linux becomes more than just a
hackers' OS, the number of people belonging to this last group increases
very fast, and programmers must take this into account. We all want Linux
to be a friendly OS to all those people who are not willing to learn what
.xsession is and how a simple text editor works, but only click the
mouse and use several big applications such as netscape, staroffice,
etc.
Being a system administrator who upgraded our system to SuSE 7.0
yesterday, I played a bit with the LANG=hu_HU setting and wondered about
setting it as the default for users. I was very, very disappointed.
First, there are only a few applications that can speak Hungarian.
"mc" is one of these. The main starting screen of "mc" already contains
a typo. The names of the months and weekdays are written in all uppercase
characters, which is, needless to say, very disgusting. (Okay, I'm not
talking about mc anymore, I'm talking about glibc.) In real Hungarian,
the names of months and weekdays do not even begin with an uppercase
character (except, of course, at the beginning of a sentence). The
abbreviations of several months are 4 or 5 characters long, though only
the first 3 of them are present in libc. If mc stripped those
characters because of its string formatting rules, I'd say it's okay,
but this is not the case; here glibc is incorrect.
My next disappointment was alphabetical sorting, which also has a very
big bug. By the way, if you set LANG=hu_HU in your .bash_profile or
something like this, this will change the sorting to (buggy) Hungarian
in all the subshells and commands started from this shell, but not for
this shell itself, so an "echo *" will still use the default sorting, but
starting a "bash" manually and typing "echo *" there will use the
Hungarian sorting. Is there a correct solution for this? There could be.
On many systems there's a file /etc/environment. This is a very good
thing, since the system administrator can set default variables for the
users without having to edit several different shells' initialization
files. If login or sshd or xdm or anything else uses this file, then a
LANG setting there can solve the problem, but only for those who are
satisfied with the default LANG set by the superuser. A very clear
solution that I think we should implement is the following. For all the
programs that authenticate a user, such as "login", "sshd", "su",
"[xkgw]dm", parsing /etc/environment and then ~/.environment should be
mandatory. A glibc call for doing this should be written, which can
filter LD_* if the developers think this is a security hole, but I don't
think so; users may find it useful if their login shell is started with a
special LD_PRELOAD variable. When this is done, a nice graphical
interface could be written which allows the user to enter any
name=value pair, and also to choose the value for LANG, TZ, and
other special variables from a list. Technically, this graphical
application should write ~/.environment, or optionally /etc/environment
if run by root. (Graphical should mean both X11 and ncurses frontends.)
This method is still not perfect, since KDE wants to change the language
of the application on the fly, but it is much better than what any
distribution has.
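The proposal above can be sketched in a few lines. This is only an illustration of the idea, not an existing glibc or PAM interface: the function names are hypothetical, the file contents are passed in as strings, and the simple name=value syntax with '#' comments is an assumption about the file format.

```python
# Sketch of the proposed login-time merge of /etc/environment and
# ~/.environment. All names here are hypothetical.

def parse_environment(text):
    """Parse name=value lines, skipping blanks and '#' comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        result[name.strip()] = value.strip().strip('"')
    return result


def login_environment(system_text, user_text, filter_ld=True):
    """Apply system defaults first, then let the user's file override.

    With filter_ld=True, LD_* variables are dropped, which is the
    optional security filter mentioned in the text.
    """
    env = parse_environment(system_text)
    env.update(parse_environment(user_text))
    if filter_ld:
        env = {k: v for k, v in env.items() if not k.startswith("LD_")}
    return env


system_file = "LANG=en_US\nTZ=CET\n"
user_file = "# my settings\nLANG=hu_HU\nLD_PRELOAD=/tmp/x.so\n"
print(login_environment(system_file, user_file))
```

A graphical or ncurses front end would then only have to rewrite the user's file.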
Returning to the Hungarian libc messages for a moment, libc
error messages (such as "No such file or directory") are not yet
translated to Hungarian. I wonder why.
We can talk about internationalizing applications, but as long as these
basic problems mentioned above are not solved, you can't really do much.
You can't expect that the system administrator will explain to all the
users that if they'd like to change the language, they need to set LANG
in at least two files, .bash_profile and .xsession, and explain to them
how to use a simple text editor. You can't expect that a user who tries
Linux at home will figure it out before giving up. We need to help all
of these users by providing exactly one, very clearly designed and
implemented way of choosing a language, and this should be a utility
with both X11 and ncurses interfaces, and we must not forget that most
users do not care about how this utility works or what files it
modifies; they only need a tool which is 100% perfect.
Let us turn to the different character sets, such as Latin-2 used in
Hungarian. Look at all those applications that are developed in the
US-ASCII or Latin-1 parts of the world. Look at netscape: when you
download a Latin-2 page, it is displayed using Latin-1, and after
clicking on 'reload' it will be displayed correctly the second time.
Look at all those oversized office packages that cannot handle Latin-2
either on the screen or when printing. Look at sgml-tools, which
pretends to be one of the best documentation systems, though it is
impossible to generate Latin-2 TeX files with it. For the developers of
sgml-tools, it would take approximately 5 minutes to add a command line
option which generates Latin-2 TeX, DVI, and PS files. Now, if I want to
generate a Latin-2 PS file, I have to create only a TeX file using
sgml-tools, change the character set in it either manually or with a sed
script, and give it to latex to compile. By the way, it took me about an
hour to solve this (after several years of experience with Linux and
TeX); I needed to use strace to find out what programs sgml-tools
launches when creating a PS file and what extra environment variables it
passes to LaTeX. Thank you, all the developers of sgml-tools.
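The Netscape symptom described above, a Latin-2 page shown as if it were Latin-1, is easy to reproduce. This sketch encodes two Hungarian letters as Latin-2 bytes and decodes the same bytes both ways; the variable names are illustrative.

```python
# The same bytes read under two different 8-bit character sets.

hungarian = "ő ű"                       # o and u with double acute
raw = hungarian.encode("iso-8859-2")    # bytes as a Latin-2 page sends them

wrong = raw.decode("iso-8859-1")        # what a Latin-1 viewer shows
right = raw.decode("iso-8859-2")        # what the author wrote

print(raw)
print(wrong)
print(right)
```

The bytes never change; only the assumed charset does, which is why declaring and honoring the encoding matters.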
And look at all those programs developed in Central Europe. Look at
"links", for example, which is by far the best web browser I've seen and
handles all the characters correctly. (Okay, "lynx" is very good, too.)
And what about Latin-2 on the Linux console? If you set a Latin-2
character set, the line drawing characters of "mc" will not work. Is
there a solution? Yes, there is: some years ago I spent several days
creating a character set based on cp437, replacing several characters
with the Hungarian accented characters. Yes, you see, this is not a
Latin-2 character set I've made; it only contains the Hungarian letters
correctly. And this is the only character set I know of which
contains all the Hungarian characters and with which "mc" still draws
nice boxes.
Juraj mentioned the problem of plural forms in Slavic languages. In
Hungarian there's a similar situation: "the 4th" and "the 5th" are
translated to "a 4." and "az 5.", respectively. Whether to use "a" or
"az" for "the" follows the same kind of rule as "a" or "an" in English.
It takes about 2 or 3 lines of C code to determine whether the
pronunciation of a number begins with a consonant or a vowel. But even
Hungarian programmers are too lazy to code this; they simply write
"a(z) %d.". This problem really should be solved once and for all in a
library (either libc or a different new one) for all the languages
around the world. Obviously, translating the English message to all the
other languages is a brain-damaged idea. The correct solution is to
translate the message, which is in a special abstract language, to all
the human languages including English. We need to help the translators
using macros. For example, a newly created printf2() call should have a
%Naz macro, where N is a number; for example, %3az is replaced by "a" or
"az" depending on the pronunciation of the number given as the 3rd
argument to printf2(). Different languages should have different macros
implemented.
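A rough sketch of the "2 or 3 lines" check, written in Python rather than C. It rests on an assumption about Hungarian number names that is not spelled out in the comment: a non-negative integer is read starting with a vowel when its leading digit is 5 (öt, ötven, ...) or when it starts with 1 and that 1 stands alone in the leading group of three digits (egy, ezer, egymillió). The function name is hypothetical.

```python
# Choose "a" or "az" before a number written in digits.
# hungarian_article() is a hypothetical helper; the rule below is an
# assumption about how Hungarian numbers are pronounced.

def hungarian_article(n):
    """Return "az" if the spoken number starts with a vowel, else "a"."""
    digits = str(n)  # n is assumed to be a non-negative integer
    if digits[0] == "5":
        return "az"
    if digits[0] == "1" and len(digits) % 3 == 1:
        return "az"
    return "a"


for n in (4, 5):
    print(f"{hungarian_article(n)} {n}.")
```

A %Naz-style macro in the proposed printf2() would call a function like this on the N-th argument.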
There's one more problem with translations. Often the translation of a
menu contains the same shortcut key for two different actions. Try "mc"
(version 4.5.50) with LANG=hu_HU and try to delete a non-empty
directory. When it asks for confirmation the second time, the actions
"All" and "Cancel" both have "M" as the Hungarian shortcut key.
Fortunately, pressing "M" activates "Cancel". If it activated "All",
users could lose their files only because of the wrong shortcut keys in
the translation. The same problem appears in many menus in KDE.
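Collisions like this can be caught mechanically before a translation ships. The sketch below assumes that each label marks its shortcut with a '&' before the letter; the function name and the Hungarian sample labels are illustrative and are not copied from mc's actual catalog.

```python
from collections import defaultdict

# Find shortcut letters that are claimed by more than one label.
# shortcut_conflicts() and the '&' marker convention are assumptions.

def shortcut_conflicts(labels, marker="&"):
    """Map each duplicated shortcut letter to the labels that use it."""
    by_key = defaultdict(list)
    for label in labels:
        pos = label.find(marker)
        if pos != -1 and pos + 1 < len(label):
            by_key[label[pos + 1].lower()].append(label)
    return {key: group for key, group in by_key.items() if len(group) > 1}


english = ["&Yes", "&No", "&All", "&Cancel"]
translated = ["&Igen", "&Nem", "&Mind", "&Mégsem"]  # illustrative only

print(shortcut_conflicts(english))
print(shortcut_conflicts(translated))
```

Running such a check per dialog as part of the translation workflow would flag the dangerous cases without requiring the translator to test every screen.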
So, as I see it, the two most important things to do are to create a
standard for how to set the LANG variable and other stuff
(~/.environment, etc.), and to design and implement a framework where it
is very easy to write a word in the plural in Slavic languages, very
easy to make the name of the month appear either at the beginning of a
sentence or in the middle, very easy to write the correct "a" or "az" in
Hungarian, and so on.
Nowadays translations are usually written by programmers. They very
often make smaller or bigger mistakes, or disgusting translations.
Rather, translations should be made by people who are good at
literature and grammar and can use computers at a basic level. The job
of the programmers is to design a system that can very easily be used by
those who want to translate applications.
[»]
Using two languages
by dmoser - Oct 23rd 2000 01:29:31
I am a native English speaker but I am studying Japanese so I occasionally
want to enter text in Japanese. I want a bilingual environment that
displays application text (menu labels, etc.) in English but allows me to
enter either English or Japanese text. I haven't been able to find a nice
way of doing this. If I set the environment variable LANG=ja_JP.ujis, I
can press shift-space to flip between Japanese-English text entry in GNOME
applications. Unfortunately, this also changes all of the application text
to Japanese which isn't what I want. If I have LANG=en_GB, the application
text is in English but I can't use shift-space to flip between
Japanese-English text entry. Does anyone know how I can get the Japanese
text entry but still have English application text? What do other
bilingual Linux users do?
[»]
Unfriendly ? But what ?
by Rot Weiler - Oct 22nd 2000 18:45:30
I must disagree with one point of the author's article. I am Icelandic, and
so is my mom. My ex-sister-in-law is Polish. Her daughter is half Polish,
half Icelandic. The ex-sister-in-law's husband is Portuguese. Using a
single setting for LANG/locale would mess up all functionality for at least
half of the users (they use the same computer). Instead, I configured
each user's .profile and .xsession according to their needs, using lat1
for Icelandic and Portuguese, and lat2 for Polish. Also, my other
sister-in-law is half Yugoslavian, and comes for frequent visits. She reads
Slavic, and so she needed her own customization. All accounts are using KDE
as the preferred WM, and have the menus and programs running in the
relevant language. Try doing THAT on Windows :-)
Also, where I work (the national University), setting up computers with
only ONE fixed locale would prove fatal, as there are many students from
other countries, even a few Chinese ones!
Go Linux go! :-)
[»]
Nagari
by Pierre Abbat - Oct 22nd 2000 17:19:55
Unicode is not an adequate representation for the Nagari scripts, used
to write Indic languages. There are two ways to view a Nagari string: a
sequence of letters and a sequence of glyphs; and Unicode doesn't match
either. For instance, benglora (Bangalore) cannot be written in
Unicode. The G with no vowel does not exist. You can write
benga\lora, with a virama, but it looks funny. (The n is
an anusvara, which does exist in Unicode.)
[»]
I18n is not only translation
by Dan Ohnesorg - Oct 22nd 2000 13:40:38
I disagree with RealRodent; the problem is not only with language. The
problem is also with displaying characters, sorting strings, displaying
currency, etc.
We can all learn English, OK, we need it.
You are happy because you are living in a country where people
use ISO-8859-1, which is not in conflict with US-ASCII. But we
use ISO-8859-2. We are not able to write our characters in at least 75% of
applications, and if we can type them in, we are not able to print them
correctly. The problem exists in such big and widely used programs as Netscape
and Star Office.
This is the biggest problem. People can learn the 100 English words
which are used in an application. But they cannot use their own language in a
text area, for example.
[»]
What about flexibility?
by Anders Johansson - Oct 22nd 2000 07:18:50
Well, ignorance could twist the fate of Linux. Has everyone forgotten that
flexibility is one of the major and most important features in the Linux
world? Some may have misinterpreted this as meaning that the source code
being available, and every Linux user knowing a dozen programming languages,
will get me where I want... (Not true)
Linux is slowly transforming from a techie-only OS into a widely used
and abused tool for everyday work. Everyone seems to be focused on getting
the big (Anglo-American) corps to accept Linux as a viable alternative to
M$, companies who really don't seem to care whether their software will
cost them a zillion dollars or not.
I have to admit that my choice of Linux is rather ideological, but I
favor the thought of openness independent of financial resources, and so will
the non-computerized part of the world as it makes the transition into the
computer era. People in these countries have never been fed the M$
"use it our way or don't use it at all", and the choice of Linux
wouldn't be a change of paradigm for these people. But how will they react
when being met by the imperialistic attitude that English is the adequate
language for communicating with computers? It sure would piss me off. As
mentioned, only 12% of the people in the world handle English.
So I'd like to plead for a change in focus. The Linux movement still
is a grassroots movement, which would be nothing without the commitment of
each and every programmer that has contributed to it. Trying to turn the
heads of corporate monoliths would probably do well for the movement's
self-esteem, but not contribute to the momentum. On the contrary, a focus on the
"less developed" part of the world would probably generate a
momentum not yet seen. Think about the amount of potential programmers in
China ;). Unicode would be a sensible, already existing, standard to adopt.
Sitting with my "full-fledged" SuSE 7.0 Professional and finding
one(!) editor able to handle Unicode, one can't do anything but pray.
By the way, Win2k did it... (teaser)
[»]
Icelandic keyboards
by Baldur Gislason - Oct 21st 2000 20:46:58
I've been having problems getting Icelandic characters to display on the
console. In X this is no problem as long as I don't try to use XKB, but the
console is a different thing: the only program that wants to work with my
Icelandic characters is BitchX (the IRC client). I loaded the ISO01 charset,
by the way, so that's not the problem :/
[»]
Not all countries need them...
by RealRodent - Oct 21st 2000 18:39:55
I think that internationalization is only needed in countries where people
can't speak a word of English... For example, I live in the Netherlands, I'm
only 15 and I've got no problems at all with the English in Linux... In
countries like the Czech Republic and France the story is different (there
are no rules without exceptions!); internationalization is needed there for
Linux to get a serious place on the desktop market. But we all know the
second (and bigger!) problem with Linux as a desktop OS for the average
user, so I think this discussion is a little bit too early...
[»]
Internationalized OSS project
by Hacker_wannabe - Oct 21st 2000 16:30:19
I just wanted to point out that even commercial software has failed at
internationalization. Every single time it is either done as an
after-thought or done poorly using incompatible standards.
With regard to MS products in Hebrew/Arabic, each generation of
software uses a completely incompatible codepage set (DOS/Win 3.x/95/2K)
and it is possible that it serves as an MS strategy, in the same manner
that closed specification .doc files do.
Furthermore, the internationalized version is by far more buggy, since
it was not finished (feature-wise) before the process of
internationalization began.
I for one believe that OSS is the only chance for true Unicode based
software, taking into account the "Berlin" project and such. Take
the huge success of Turbo-Linux in Japan, for instance. I bet it has
little to do with any distribution
features other than using Japanese in the best way.
If the target of the Linux developer community is world domination,
using Unicode would make it a viable alternative for every single
non-English application.
[»]
i18n
by Dan Maas - Oct 21st 2000 13:57:50
I think the root of awkward internationalization in Linux is trying to stay
backwards compatible with 1) existing fragmentary practices, and 2)
traditional command-line tools.
The "existing practices" I refer to are the incompatible character
sets and input/display methods currently in use around the world. Every
country seems to have its own ASCII superset, along with inflexible
input-output software that is tailored for that one language (e.g.
left-to-right vs right-to-left).
Thankfully UNICODE came along and solved the first problem -- but due
to sheer momentum very few projects have adopted it. I am appalled at the
myriad character sets we still need to have fonts for, maintain transcoding
tables for, etc. For a nasty example look no further than Chinese - we have
at least three 100% incompatible encodings for this language, none of which
accommodate any *other* languages, and any one of which would completely
suffice for encoding Chinese (Japanese? Hah, they have their OWN group of
standards!). As a result, displaying Chinese, Japanese, or Korean text on
Linux desktops is a trial in frustration (one that must be repeated for
each character set!). I am doubly disappointed to see the current i18n
efforts (esp. Gtk/GNOME) propagating this horrible mess, rather than
replacing it with UNICODE everywhere (as Microsoft has done already in
Win2k!). Granted UNICODE does not solve all problems - maintaining a
collection of fonts is still non-trivial - but can we at least agree on a
*single* character set?
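The "single character set" argument can be illustrated with a short check. In this sketch, a string mixing Slovak, Norwegian, Chinese, Japanese, and Korean text round-trips through UTF-8 but fits neither of the two Latin charsets discussed in the article; the helper name and sample text are illustrative.

```python
# One Unicode string, several scripts, one encoding.
# can_encode() is a hypothetical helper for this illustration.

def can_encode(text, charset):
    """Return True if `text` can be represented in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


mixed = "súbor blåbær 中文 日本語 한국어"

print(can_encode(mixed, "utf-8"))
print(can_encode(mixed, "iso-8859-1"))
print(can_encode(mixed, "iso-8859-2"))
print(mixed.encode("utf-8").decode("utf-8") == mixed)
```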
Owen Taylor's Pango (http://www.pango.org/) will solve the second
problem by providing a "least common denominator" library that attempts to
accommodate display of all directions, encodings, and styles of text. Look
for this to be integrated with Gtk soon. Thank you, Owen.
Nonetheless, I believe the traditional nls/gettext method of
internationalization favored on UNIX is fundamentally not the correct
approach. I claim that i18n does not belong at the level of the C library
or the command line - rather it should be a task of the GUI/windowing
system, much as "themes" are today. Cramming this functionality down to the
C library level just results in bloat where it is least needed (try running
strace on a modern glibc app -- even 'cat' or 'ps' -- you'll see what I
mean...). Raising i18n to the level of X or your GUI library is not only a
way to trim the fat. I can envision all sorts of neat additional benefits,
such as caching translations at the display server, or firing off requests
to Babelfish when a real translation isn't available. All an application
should have to do is send a UNICODE string to the display. (the current
trend of moving these things client-side makes me sick -- though there's
not much else one can do given the obsolete monster that is X. But that's a
rant for another day =)
All current approaches to i18n, Microsoft and GNU included, are
imperfect. It will be interesting to see which one can prove itself more
useful...
[»]
Linux Internationalization Problems
by nevit - Oct 21st 2000 13:07:03
It is impossible to disagree... although a lot of work has been done on
internationalization, a lot of work remains to be done...
Unmentioned in the above article is typing direction, which is important in
the Hebrew, Arabic, Ottoman, Azerbaijani and Persian language groups, which
are typed from right to left. A smart keyboard is also needed to
differentiate between initial, middle and end forms of letters.
Also unmentioned are printers and printing problems... Rendering the
fonts on screen is not the end of the game... I had difficulties printing
some documents in Turkish (iso 8859-9) ... although the docs appeared correctly
on screen, printing was a problem with some characters.
There are strategic pieces of software development which, if well
documented, will attract and open the doors for contributions from members
all over the world. The most urgent is documentation.
The following docs should be prepared to ease international
contribution.
* Freetype font creation. How to prepare a font needed for a
language.
* Bidirectional keyboard definitions / How to define a keyboard for X
language
* Unicode, documentation and implementation should also be rushed...
The article addressed an important problem.
[»]
a non-english native language point of view
by bacano - Oct 21st 2000 11:01:28
The major problem with putting Linux on desktops worldwide is not the
language. Tech people around the world are used to needing English from the
beginning of any kind of IT studies/lectures, since translations usually
suck.
The problem is the international contracts that Microsoft makes and
Linux vendors don't. IMHO it is not a technical problem but a commercial
one.
I think my example is a good one. I'm Portuguese, I use SuSE in the
English version (and SuSE has a Portuguese one), and I don't have any
problems with my keyboards (like with 'ç' or 'õ' or 'à'), neither on
my desktops nor on my laptops. I work at a Microsoft Solution Provider,
and I have already found amazing things for Linux, like KBasic, OLAP services,
and XML editors, and I use Star Office, which is fully compatible with MS
Office 97 work files. I can connect to NT networks, and I can do work in
Linux that can be easily ported to Windows, and vice versa. I can use Perl
on both sides, I can use PHP on both sides ... or other things too.
So, where is the problem? I'm not allowed to use Linux at work, so the
only things I can/must do are:
- continuing my search for Linux software to prove that
anything can be done here too;
- continuing my efforts to share my (little) knowledge about Linux
with my work colleagues;
- continuing to annoy my Department Manager by showing him that
almost everything that he does in Windows I can do in Linux, with my
small financial resources, instead of using the resources of a
billion-dollar corporation.
Why don't people change, then? They are afraid of what they don't know,
and they don't want others to 'shine' in their little 'monopolies'.
Linux vendors must copy just one thing from Microsoft:
MARKETING&SALES.
But because Linux gurus are 'just' tech people, and most of the Linux
vendors' staff too, the gap is in making worldwide contracts with worldwide
corporations.
Just a little addition before ending: of the almost 100 companies to which I
give technical support (all in Portugal, with Portuguese people), all of
them use Windows, none use Linux, and the ones that use Portuguese versions
of Windows or MS Office are around 5-10%, so Linux being 'English native'
is not really a problem.
MARKETING, SALES and EDUCATION ... these are the issues that Linux
vendors must work on harder.
[ ]'s bacano
[»]
Linux Internationalization Problems
by Radek Vybiral - Oct 21st 2000 10:47:55
I agree with this article. For example, Redhat distribution 6.2 CZ is made
especially for Czech users and has more than 90% in stores.