l i n u x - u s e r s - g r o u p - o f - d a v i s
L U G O D
 
Page last updated:
2003 Jun 09 17:14

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] vim and utf-8 support (newbie alert)



On Mon, 9 Jun 2003, Peter Jay Salzman wrote:

> right-to-left languages are really, really, really well supported in
> vim.  at least, they seem to be.  check out:
>
>    :set rl

Cool~

> the language i'm thinking of is hebrew, but with some important issues.
>
> 1. i need vowel support.
> 2. i really want to have mixed hebrew/english
>
> i believe taken together, i want to use ISO 10646 which can represent
> all languages at the same time.

Unfortunately I don't know Hebrew, so I don't know what the
difficulties are.  For both Korean and Japanese, the common encodings use
two bytes to represent a single Asian "character", while maintaining
backwards compatibility with ASCII by setting the MSb (most significant
bit) of the first byte to flag a multibyte character.
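
Here is a rough sketch of what I mean by the MSb flag, written in Python.
It assumes EUC-KR as a stand-in for the two-byte Korean encoding (I didn't
name one above), and `high_bit_flags` is just a helper made up for this
illustration:

```python
# Assumption: EUC-KR stands in for the legacy two-byte Korean encoding.
def high_bit_flags(text, encoding):
    """Return (byte value, MSb set?) for every byte of the encoded text."""
    return [(b, bool(b & 0x80)) for b in text.encode(encoding)]

# A plain ASCII letter stays a single byte with the MSb clear.
ascii_flags = high_bit_flags("A", "euc_kr")

# One Hangul syllable (U+D55C) becomes two bytes, both with the MSb set.
hangul_flags = high_bit_flags("\ud55c", "euc_kr")

for value, msb in ascii_flags + hangul_flags:
    print(f"0x{value:02x} MSb={'1' if msb else '0'}")
```

A decoder only has to look at the MSb of a byte to know whether it is
plain ASCII or part of a multibyte character.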

If Hebrew does the same thing, there is no technical reason why it can't
use both English and Hebrew.  There should even be enough character space
to include vowels.  But I wouldn't know how to access them using the
Hebrew input method.

With your Hebrew terminal, try `cat`-ing any binary file.  If you see any
Hebrew vowels and letters, then you just have to figure out how to type
them using the Hebrew input method.  If not, then the Hebrew encoding
probably isn't compatible with English and can't do vowels.

In that case, your best bet is to use a universal character set like
Unicode, encoded as UTF-8 (I assume Unicode has Hebrew vowels).  But I'm
not sure how to set that up, so good luck...
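
One way to check the assumption about vowels is to ask Python's
`unicodedata` module.  This is only a sketch: the sample word and the
variable names are picked for illustration and are not from the original
question.

```python
import unicodedata

# U+05B8 is one of the Hebrew vowel marks (points) defined in Unicode.
vowel_name = unicodedata.name("\u05b8")

# Hypothetical sample: English text followed by a pointed Hebrew word
# (shin, shin dot, qamats, lamed, vav, holam, final mem).
hebrew = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
mixed = "hello " + hebrew

# UTF-8 keeps the ASCII part as single bytes and uses two bytes for each
# Hebrew letter or vowel mark.
encoded = mixed.encode("utf-8")
round_trip = encoded.decode("utf-8")

print(vowel_name)
print(len(mixed), "code points,", len(encoded), "bytes")
```

So at the encoding level, mixed English and Hebrew with vowels fits in one
UTF-8 string; the remaining work is in the terminal, font and input method.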

> as a first stab at getting utf-8 capable xterms, i set:
>
>    LC_CTYPE=en_US.UTF-8
>
> but weird things started to happen, like mutt's threading lines turned
> into really strange characters.  i guess the applications themselves
> need to be utf-8 aware too.

UTF-8 is compatible only with the standard (7-bit) ASCII set.  The
threading lines are in the extended ASCII set, which uses the MSb, not the
standard ASCII set.  They clash because UTF-8 uses the MSb to signal a
multibyte character, while the extended ASCII set uses it for extra
single-byte characters.
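
To make the clash concrete, here is a small Python sketch.  It assumes the
line-drawing byte comes from code page 437, which is only one example of
an "extended ASCII" set; I don't know which bytes Mutt actually sends.

```python
# Assumption: 0xC4 is the horizontal line-drawing byte in code page 437.
raw = b"\xc4"

# Read as extended ASCII, the byte is one complete character.
as_cp437 = raw.decode("cp437")

# Read as UTF-8, the same byte starts a multibyte sequence that never
# finishes, so strict decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# A lenient decoder substitutes the replacement character instead, which
# is roughly the kind of "strange character" a terminal ends up showing.
as_utf8_lossy = raw.decode("utf-8", errors="replace")

print(repr(as_cp437), utf8_ok, repr(as_utf8_lossy))
```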

I recommend just ignoring it (you get used to it).  Otherwise, I think you
can tell Mutt to use standard ASCII for threading lines (using +, -, |,
etc.).

> > Works great under WindowsXP (everything's in unicode; just make sure you
> > got the fonts installed.)
>
> that makes me very sad...   :(

It's one of the reasons I have Windows XP.  The international language
support is amazing.  I can read multi-language data files with ease.
I've seen Windows 2000 also do a very nice job.

One of these days, Linux needs to be totally UTF-8 based, with all
software ported to UTF-8.  But I'm thinking that's going to require too
much effort by too many people to happen quickly.

> it totally sucks that mixed hebrew-with-vowels/english turned out to be
> such a hard thing to do.  :( sucks even worse that it's easy on windows
> xp.   :(

I think it shows more about the shortcomings of X than about Windows
(although I'm amazed that MS did such a good job with it).  It's something
the OSS community needs to work on...  I think part of the problem is that
there isn't much information available about internationalization, and
it certainly isn't taught in schools or covered in many books.

Maybe we should start rallying for some standard everyone can work with.
One XIM to handle all languages, one way to display any language, etc.
Put 'em together and we'll have at least some structure people can port
their programs into.  Or maybe there's something already out there that
just needs good documentation.  Whatever the case, I'm tired of working
with all these hacks to get internationalization support working under X
and I bet there are many more people that want better support, too.

Just a thought.

-Mark

-- 
Mark K. Kim
http://www.cbreak.org/
PGP key available upon request.


_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech




LUGOD: Linux Users' Group of Davis
PO Box 2082, Davis, CA 95617
Contact Us

LUGOD is a 501(c)7 non-profit organization
based in Davis, California
and serving the Sacramento area.
"Linux" is a trademark of Linus Torvalds.
