Roland Kay wrote:
> 
> OK. Things have got a bit complicated so I'm attaching all
> the necessary patches to this email. They should apply in
> any order but on my system I apply them in this order:
> 
> 1,	RTF-AsianFontNames.patch  (new)
> 2,	RTF-warnings-2.patch      (same as previous post)
> 3,	XML-Props.patch           (same as previous post)
> 
> 
> No. 1 is the finalised RTF Asian font names patch. This
> reads the escaped hex multi-byte font names used in Asia and
> stores them as UTF-8 in the document. Thus, Chinese users get
> to see the font names in Chinese characters if this was how
> they were encoded in the document. Also, with appropriate
> font installation or font aliasing, all valid documents can
> be displayed without all the characters turning into
> circles.
> 
> I've gone back to using UT_String and not UT_UTF8String
> because using UT_UTF8String was corrupting the Chinese font
> names. The reason is as follows:
> 
> In China MSWord exports RTF with the font names encoded in
> the GB charset. I read these in one character at a time into
> a UT_String. Once the entire font name has been read I
> convert the string to UTF8 and hand it to
> RTFFontTableItem(). Thus, the UT_String never holds UTF8. In
> fact, it holds the font name in the native character set.
> Trying to append the (8 bit) GB characters to a UT_UTF8String
> causes them to become corrupted.
> 
> It makes much more sense to me to read the entire string in
> and convert it in one go, rather than trying to convert it
> one character at a time.
Then that is a job for UT_ByteBuf, not UT_String, because we are just
storing arbitrary bytes in a buffer.
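
To illustrate the point, here is a rough sketch in Python rather than AbiWord's C++, so none of the names below are AbiWord APIs. The helper names, the regular expression, and the sample font name are made up for illustration, and the sketch assumes the font name arrives as RTF \'hh hex escapes encoded in GB2312. It collects the raw bytes in a plain byte buffer and converts the whole name to UTF-8 only once it is complete:

```python
import re

# Matches one RTF hex escape such as \'cb (the escape form is an assumption).
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")


def collect_font_name_bytes(rtf_fragment: str) -> bytearray:
    """Accumulate the raw bytes of a font name without interpreting them."""
    raw = bytearray()
    pos = 0
    while pos < len(rtf_fragment):
        match = HEX_ESCAPE.match(rtf_fragment, pos)
        if match:
            # Escaped byte: store its value as-is, with no charset conversion yet.
            raw.append(int(match.group(1), 16))
            pos = match.end()
        else:
            # Unescaped character: assumed to be plain ASCII in this sketch.
            raw.append(ord(rtf_fragment[pos]))
            pos += 1
    return raw


def font_name_to_utf8(raw: bytes, charset: str = "gb2312") -> bytes:
    """Convert the complete byte buffer to UTF-8 in a single step."""
    return bytes(raw).decode(charset).encode("utf-8")


# Hypothetical sample: a two-character Chinese font name as GB2312 hex escapes.
raw = collect_font_name_bytes(r"\'cb\'ce\'cc\'e5")
utf8_name = font_name_to_utf8(raw)
```

The buffer only ever holds bytes in the document's native charset, so nothing can reinterpret half of a two-byte GB character as text before the conversion runs.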
Hub
Received on Sun Jun  5 17:18:11 2005