Hi Roland,
This sounds very exciting!
Could you maybe create screenshots of editing Chinese documents? If
the other developers agree, we could put them on our website to
emphasize abi's capabilities.
Thanks, 
- Rob
On 6/3/05, Roland Kay <roland.kay@ox.compsoc.net> wrote:
> 
> 
> Hi Guys,
> 
> Here are two more RTF importer patches. They should be
> applied in the following order:
> 
>         1, RTF-AltFontName-ver2.patch
>         2, RTF-warnings-2.patch
> 
> The second one is very simple. It just fixes some warnings
> generated by declared but unused variables. One of these was
> introduced by my earlier patch. The other two are in the
> code that processes the \*\abirevision keyword. The code
> looks correct to me with the two unnecessary declarations
> removed. However, it might be an idea for whoever wrote that
> bit of code just to check.
> 
> 
> The first patch is a bit more involved. In Asia, Microsoft
> Word exports RTF font tables like this:
> 
> {\fonttbl
>   {\f0\froman\fcharset0\fprq2{\*\panose ...}Times New Roman;}
>   {\f17\fnil\fcharset134\fprq2{\*\panose ...}FZSongTi;}
>   {\f18\fnil\fcharset134\fprq2{\*\panose ...}\'cb\'ce\'cc\'e5{\*\falt SimSun};}
>   {\f19\fnil\fcharset134\fprq2{\*\panose ...}\'cb\'ce\'cc\'e5;}
> }
> 
> NB: I've abbreviated the panose numbers.
> 
> The third entry refers to a font whose name is made entirely
> of Chinese characters (it's actually "SongTi") encoded in
> GB2312.
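
The encoding claim above can be checked outside AbiWord by decoding the escaped bytes directly. This is only a rough Python sketch, not importer code: the helper name decode_rtf_hex_name is made up for illustration, and it assumes the name consists solely of \'xx escapes in a single known codec.

```python
import re

def decode_rtf_hex_name(escaped: str, encoding: str = "gb2312") -> str:
    """Decode a run of RTF \\'xx escapes into text using the given codec.

    Hypothetical helper for illustration; anything that is not a
    \\'xx escape is simply ignored.
    """
    # Collect the two hex digits that follow each \' marker.
    hex_pairs = re.findall(r"\\'([0-9a-fA-F]{2})", escaped)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    return raw.decode(encoding)

# Font name bytes copied from the \f18 and \f19 entries above.
name = decode_rtf_hex_name(r"\'cb\'ce\'cc\'e5")
print(name)
```

The four bytes should decode to a two-character name, the one romanised above as "SongTi".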
> 
> Without the patch the importer ignores the first "\'" and
> then mistakes the first "cb" for the font name. It then
> ignores the rest. The result is that any Chinese font like
> this gets named with a two-letter hex code, which looks
> pretty silly. Worse, since AbiWord subsequently finds no font
> "cb" on the system, all the Chinese characters come out as
> circles. The only way to view the document is then to
> "Select All" and choose a sensible font, which mucks up the
> formatting of the document.
> 
> With the patch in place, the importer skips escaped hex
> sequences in the font names. If an alternative font name is
> given and the main font name is blank (either because it was
> really blank, or else because it had no ASCII characters)
> then the importer substitutes the alternative fontname for
> the real one.
> 
> The result is that if the exporting application bothers to
> give an alternative font name, Chinese fonts have sensible,
> predictable names. Sadly, while MSWORD gives alternative font
> names for the most common CJK fonts, it doesn't do so for
> all. Thus, in the case of a font which only has a non-ASCII
> name and no alternative, the patch substitutes
> "UnknownUnicodeFontName".
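
The name-selection rule described above (skip hex escapes, fall back to \falt, then to a placeholder) can be sketched roughly as follows. This is an illustration in Python and not the patch itself, which lives in AbiWord's C++ importer. The function and constant names are invented, the input is assumed to be the text between the outer braces of one font table entry, and escaped braces such as \{ are not handled.

```python
import re

FALLBACK_NAME = "UnknownUnicodeFontName"
# A control word: backslash, letters, optional numeric parameter,
# optional single space acting as delimiter.
CONTROL_WORD = re.compile(r"\\[a-zA-Z]+(-?\d+)? ?")
HEX_ESCAPE = re.compile(r"\\'[0-9a-fA-F]{2}")

def _matching_brace(text, start):
    """Return the index of the '}' that closes the '{' at text[start]."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces in font entry")

def font_name_from_entry(entry):
    """Pick a usable font name from one font table entry (hypothetical helper)."""
    name_chars = []
    alt_name = ""
    i = 0
    while i < len(entry):
        ch = entry[i]
        if ch == "{":
            # Nested group: remember \falt text, ignore everything else.
            end = _matching_brace(entry, i)
            group = entry[i + 1:end]
            if group.startswith("\\*\\falt"):
                alt_name = group[len("\\*\\falt"):].strip()
            i = end + 1
            continue
        if ch == ";":
            break  # the semicolon terminates the entry
        skipped = HEX_ESCAPE.match(entry, i) or CONTROL_WORD.match(entry, i)
        if skipped:
            i = skipped.end()  # drop \'xx bytes and keywords like \fnil
            continue
        name_chars.append(ch)
        i += 1
    name = "".join(name_chars).strip()
    # Prefer the ASCII name, then the alternative, then the placeholder.
    return name or alt_name or FALLBACK_NAME

samples = [
    r"\f0\froman\fcharset0\fprq2{\*\panose ...}Times New Roman;",
    r"\f18\fnil\fcharset134\fprq2{\*\panose ...}\'cb\'ce\'cc\'e5{\*\falt SimSun};",
    r"\f19\fnil\fcharset134\fprq2{\*\panose ...}\'cb\'ce\'cc\'e5;",
]
for sample in samples:
    print(font_name_from_entry(sample))
```

Run on entries modelled on the font table quoted earlier, the sketch should give "Times New Roman", "SimSun" and "UnknownUnicodeFontName" respectively.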
> 
> 
> A fringe benefit is that the font table parser is more robust
> than before and will correctly handle strange, but
> apparently legal, entries like:
> 
>    {\f20\froman Times New {\*\unknowncommand Fibble!}Roman;}
> 
> If you're running AbiWord on Linux, this doesn't solve the
> problem of Chinese characters being represented as circles
> because the names of the Chinese fonts are different from
> Windows. Thus, abi can't find "SimSun", in the case of the
> above example, either. I guess this might not be a problem
> on a Windows machine. However, since the Chinese fonts now
> have sensible names we can create a font alias for SimSun.
> Once this is done Chinese documents can be opened and
> displayed immediately. By aliasing UnknownUnicodeFontName
> to the most common font for their region the user can then
> display most of the documents they receive even if the
> exporting app doesn't give an alternative name.
> 
> On Suse, these aliases can be made by adding the following
> to /etc/fonts/local.conf and then running fonts-config.
> 
> <!-- Alias SimSun to FZSongTi so that Chinese docs show up in AbiWord
>      - R.Kay (02-06-05)                                                -->
> <alias>
>  <family>SimSun</family>
>  <prefer>
>   <family>FZSongTi</family>
>  </prefer>
> </alias>
> <alias>
>  <family>SimHei</family>
>  <prefer>
>   <family>FZHeiTi</family>
>  </prefer>
> </alias>
> <alias>
>  <family>UnknownUnicodeFontName</family>
>  <prefer>
>   <family>FZSongTi</family>
>  </prefer>
> </alias>
> 
> 
> Issues outstanding:
> -------------------
> 
> 1,      It would be nice if abi could read the real Chinese
>         font name rather than relying on the alternative
>         name. Modifying the above patch to achieve this is
>         trivial, and in fact I already have code to do this
>         since that was my original intention.
> 
>         Unfortunately, it appears from the code that abi
>         assumes that font names will only contain ASCII
>         characters. For instance, when building lists of
>         character properties the arrays seem to be of
>         type XML_Char which is typedefed to char. I'm afraid
>         that redefining XML_Char as UCS4 will cause problems
>         throughout the program.
> 
>         Would allowing font names to contain arbitrary
>         Unicode characters be feasible in 2.6?
> 
> 2,      OpenOffice can identify the font as Chinese and
>         automatically substitute an appropriate alternative
>         without needing any font aliases. Does AbiWord have
>         any such font substituting capability? Does anyone
>         know how OO does this? I gather from some of the
>         comments on bugzilla that this may not be abi's job.
> 
> 
> References:
> -----------
> 
> These bug reports are related to these issues:
> 
> http://bugzilla.abisource.com/show_bug.cgi?id=3312
> http://bugzilla.abisource.com/show_bug.cgi?id=3954
> 
> Best wishes,
> 
> R.
> 
> 
> 
>
Received on Fri Jun  3 14:09:29 2005