From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Mon May 06 2002 - 10:45:23 EDT
--- Hubert Figuiere <hfiguiere@teaser.fr> wrote:
> On Sun, 2002-05-05 at 17:56, Tomas Frydrych wrote:
> >
> > I have committed the changes toward 32-bit internal representation
> > of Unicode and removed the lock from the src directory. These
> > changes cover only the main module XP, win32 and gtk code and
> > the wordperfect importer. I will leave the other platforms and plugins
> > for others to do, see the notes below.
> >
>
> So from now on, I'll no longer back port stuff to STABLE branch... But
> the reverse, I will to some extent.
Just keep it stable (:
> > Summary of the changes
> > -------------------------------------
> > There are three new types now: UT_UCS4Char, UT_UCS2Char and
> > UT_GrowBufElement. There is a new string class UT_UCS4String,
> > and new sets of UT_UCS4_ and UT_UCS2_ string functions
> > replacing the UT_UCS_ functions. All internal Unicode processing
> > should be done using the UT_UCS4Char and functions. I have left
> > the UT_UCSChar type in place for the time being, as an equivalent
> > of the new UT_UCS4Char type; this is a temporary measure that is
> > meant to make the transition easier and once we are done we will
> > do a global replace and remove UT_UCSChar from the ut_type.h
> > file. Consequently, all new code should only use UT_UCS4Char.
> >
> > Notes on transferring the remaining code:
> > (1) Replace any UT_UCS_ calls with UT_UCS4_ or UT_UCS2_ as
> > appropriate; replace any UT_UCS2String instances with
> > UT_UCS4String, where appropriate. Outside of impexp code and
> > the input methods and platform specific text drawing calls this can
> > be done blindly; in these special cases more care is needed.
>
> Is there a way to translate UCS2 to UCS4 easily? Because sometimes I get
> UCS strings from Cocoa and have to pass them as UCS4....
If it really is UCS-2, yes: you pad each 16-bit value
with zeros to 32 bits, assuming the byte order is
native, of course.
If it's UTF-16 but claims to be UCS-2, as Windows
does, then surrogates will get broken and bugs that
are very difficult to track down will appear.
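
A rough sketch of the difference between the two cases.
The UCS2Char and UCS4Char typedefs here are stand-ins I
made up for illustration, not the real AbiWord types, and
both function names are hypothetical. Replacing an
unpaired surrogate with U+FFFD is just one possible
policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for the AbiWord character types (assumed, not the real typedefs).
typedef std::uint16_t UCS2Char;
typedef std::uint32_t UCS4Char;

// True UCS-2: every 16-bit unit is a whole code point, so widening is
// plain zero-extension of each unit (native byte order assumed).
std::vector<UCS4Char> widenUCS2(const UCS2Char* src, std::size_t len)
{
    return std::vector<UCS4Char>(src, src + len);
}

// UTF-16: a high surrogate (D800-DBFF) followed by a low surrogate
// (DC00-DFFF) encodes a single code point above U+FFFF.
std::vector<UCS4Char> decodeUTF16(const UCS2Char* src, std::size_t len)
{
    std::vector<UCS4Char> out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        UCS4Char c = src[i];
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < len &&
            src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            // Combine the pair into one code point and skip the low half.
            c = 0x10000 + ((c - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;
        } else if (c >= 0xD800 && c <= 0xDFFF) {
            c = 0xFFFD; // unpaired surrogate: substitute U+FFFD
        }
        out.push_back(c);
    }
    return out;
}
```

Feed both functions the units 0xD834 0xDD1E: widenUCS2
hands back two bogus characters, while decodeUTF16 gives
the single code point U+1D11E.
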
The other easy way is to just use iconv. There should
be functions to convert from a UCS2String to a
UCS4String and vice versa. If there aren't, they are
pretty easy to implement using iconv. It might
require memory allocation, though, so you'll have to
be prepared to handle the allocation failing.
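
Something along these lines might do for the iconv
route. The function name utf16ToUcs4 is hypothetical,
and the encoding names ("UTF-16LE", "UTF-32LE" and the
BE variants) are an assumption: glibc and GNU libiconv
accept them, other iconv implementations may not. Some
platforms also declare the input argument of iconv() as
const char**, which would need a different cast.

```cpp
#include <iconv.h>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts native-endian UTF-16 to UCS-4 via iconv.
// Returns false if the converter cannot be opened or the input is malformed.
// Note that growing the vector can throw std::bad_alloc on allocation failure.
bool utf16ToUcs4(const std::uint16_t* src, std::size_t len,
                 std::vector<std::uint32_t>& out)
{
    out.clear();
    if (len == 0)
        return true;

    // Detect the native byte order so we can name it explicitly (no BOM).
    const std::uint16_t probe = 1;
    const bool little = *reinterpret_cast<const unsigned char*>(&probe) == 1;

    iconv_t cd = iconv_open(little ? "UTF-32LE" : "UTF-32BE",
                            little ? "UTF-16LE" : "UTF-16BE");
    if (cd == (iconv_t)-1)
        return false;

    // The output never has more code points than the input has 16-bit units.
    out.resize(len);
    char* in = reinterpret_cast<char*>(const_cast<std::uint16_t*>(src));
    std::size_t inLeft = len * sizeof(std::uint16_t);
    char* dst = reinterpret_cast<char*>(out.data());
    std::size_t outLeft = out.size() * sizeof(std::uint32_t);

    const std::size_t rc = iconv(cd, &in, &inLeft, &dst, &outLeft);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1)) {
        out.clear();
        return false;
    }
    // Trim the slots that surrogate pairs left unused.
    out.resize(out.size() - outLeft / sizeof(std::uint32_t));
    return true;
}
```
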
Oh, and find out if Cocoa is really using UCS-2 or
UTF-16!
Andrew Dunbar.
> Hub
=====
http://linguaphile.sourceforge.net http://www.abisource.com
This archive was generated by hypermail 2.1.4 : Mon May 06 2002 - 10:48:00 EDT