Subject: Re: announce of patch to support CJK in AW
From: hj (huangj@citiz.net)
Date: Sun Oct 29 2000 - 01:13:30 CST
It would be best if all non-English locales saved Unicode in *.abw files rather
than multibyte (mbs) text. Then we will be able to display characters from
several languages in one document in the future.
----- Original Message -----
From: Vlad Harchev <hvv@hippo.ru>
To: <abiword-dev@abisource.com>
Cc: Belcon <rainfall@yeah.net>; hj <huangj@citiz.net>; Chih-Wei Huang <cwhuang@linux.org.tw>
Sent: October 28, 2000 3:33
Subject: announce of patch to support CJK in AW
> Here is a location of the patch:
>     http://www.hippo.ru/~hvv/abiword/aw-cjk.diff.gz
>
> This patch can be cleanly applied over vanilla 0.7.11 patched with the
> following patches (I hope they are still there):
>     ftp://seviorpc.ph.unimelb.edu.au/pub/abi-oct24-cvs.patch.gz
>     ftp://seviorpc.ph.unimelb.edu.au/pub/wv-oct24-cvs.patch.gz
>
> What's there:
> 100% of HJ's patch logic is there. The code and logic were greatly cleaned
> up compared to the original patch and should compile (I didn't test) on any
> platform (HJ's patch made use of glib in xp code). Also, HJ's logic is
> disabled if the current locale is not a CJK one. That was extensively tested
> with Dom's German document and (in full, all aspects) with Russian.
>
> The added functionality:
> * UT_Wctomb and UT_Mbtowc are now used for converting between various
>  charsets. They use iconv internally (so they now work, are portable, and
>  also allow choosing the input/output encoding). Every use of iconv
>  (except in wv) correctly swaps the bytes of UCS (the correct order is
>  detected at runtime).
>
> * Thanks to HJ, AW emits only the necessary fonts to the .ps when printing.
>  This reduced the size of the .ps file generated by AW by a factor of 5 for
>  one font-rich document of mine (a title for some paper).
>
> * Fixed bug with spellchecker ("replace" button didn't work).
>
> * AW now looks for more files besides
>   ${prefix}/AbiSuite/AbiWord/system.profile: it also looks for
>   ${prefix}/AbiSuite/AbiWord/system.profile-${SUFFIX}, for the following
>   values of ${SUFFIX}:
>       'language', 'charset', 'language-Country', 'language-Country.charset'
>   This allows shipping language-specific defaults (e.g. metric system or
>   name of spellchecker dictionary).
>
> * As for fonts, AW now tries to load fonts from the following subdirectories
>   of ${prefix}/AbiSuite/fonts:
>       'language', 'charset', 'language-Country', 'language-Country.charset'
>   This should solve CJK people's problems (before this patch, AW looked
>   only in the 'charset' subdirectory).
>     Fonts in 'fonts.dir' format should be placed in them. Under CJK locales,
>   the file with the list of fonts is also named 'fonts.dir', but it has the
>   same extended format as HJ's 'fonts.hj'. Consider this when trying.
>
> * If GNOME_XML2 is undefined UnknownEncodingHandler is set on XML parser
>   in ie_imp_XML.cpp
>
> * Some translator's names added to CREDITS.TXT
>
> * Just to underline: support for "wrap-at-any-CJK-letter" logic of layout
>   is already there too (thanks to HJ).
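
As an illustration of the "wrap-at-any-CJK-letter" idea, here is a minimal
sketch in Python (not the patch's actual C++ code). The function names and the
Unicode ranges used to recognize CJK letters are assumptions made for this
example only; real layout code also has to handle punctuation rules that are
ignored here.

```python
def is_cjk(ch):
    """Rough CJK test covering only a few common Unicode blocks (an assumption)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7A3   # Hangul Syllables
    )


def break_positions(text):
    """Return the indices i at which a line may be broken before text[i]."""
    positions = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if prev == " " and cur != " ":
            positions.append(i)     # usual rule: break after a run of spaces
        elif is_cjk(prev) or is_cjk(cur):
            positions.append(i)     # CJK rule: break next to any CJK letter
    return positions
```

With this sketch, plain Latin text can only be broken after spaces, while a run
of CJK letters can be broken between any two characters.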
>
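
The suffix scheme described above for system.profile and for the font
subdirectories can be sketched as follows. This is a hypothetical Python
illustration, not code from the patch: the function names are invented, the
locale name is assumed to look like 'zh_CN.GB2312', and the candidates are
returned in the order the message lists them, which may not be the order AW
actually tries them in.

```python
def locale_suffixes(locale_name):
    """Split a locale name such as 'zh_CN.GB2312' into lookup suffixes."""
    lang_country, _, charset = locale_name.partition(".")
    language, _, country = lang_country.partition("_")
    suffixes = [language]                                   # 'language'
    if charset:
        suffixes.append(charset)                            # 'charset'
    if country:
        suffixes.append(f"{language}-{country}")            # 'language-Country'
        if charset:
            # 'language-Country.charset'
            suffixes.append(f"{language}-{country}.{charset}")
    return suffixes


def profile_candidates(prefix, locale_name):
    """List the system.profile files that would be looked for."""
    base = f"{prefix}/AbiSuite/AbiWord/system.profile"
    return [base] + [f"{base}-{s}" for s in locale_suffixes(locale_name)]


def font_dir_candidates(prefix, locale_name):
    """List the font subdirectories that would be searched."""
    return [f"{prefix}/AbiSuite/fonts/{s}" for s in locale_suffixes(locale_name)]
```

For example, font_dir_candidates("/usr/local", "zh_TW.Big5") yields the four
subdirectories 'zh', 'Big5', 'zh-TW' and 'zh-TW.Big5' under
/usr/local/AbiSuite/fonts.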
>    I think that this patch can be committed since it doesn't break anything
> under non-CJK locales (at least if you did 'make clean' after applying).
>
>    I ask CJK people to test the following, in order of precedence:
> * Input of CJK letters in various charsets. It should work. Make doubly sure
>   that gtk+ is installed correctly, that fonts are in the font path, etc.
>   Currently you have to have all CJK fonts AW uses available in the font path
>   before the AbiWord wrapper starts (it's not yet updated to look into all
>   the subdirectories AW now looks in).
> * Cutting and pasting immediately. If this doesn't work, then test the RTF
>   importers and exporters (AW uses them internally for cut/paste). Only very
>   minor tweaks would be needed to make it work if it doesn't. If
>   cutting/pasting works, try exporting/importing RTF and testing it with
>   other apps (e.g. Word).
> * Cutting/pasting to/from other apps and saving/reading plain text files.
>   It should just work.
> * Saving and loading of the native AW file format. Should work. If import of
>   .abw doesn't work, then try removing "encoding=FOO" from its first line.
> * Printing. Since 100% of HJ's logic is there, it should just work.
> * Export to HTML. Should work. If it doesn't, say what changes are needed
>   to make browsers understand it (keep in mind that the XHTML importer should
>   be able to read the produced file).
> * Checking that export of CJK texts to LaTeX works if the correct prologue is
>   added to the exported document. That prologue should be added to the tables
>   in xap_EncodingManager.cpp
> * Import of CJK .doc files (most probably it won't work due to wv's
>   single-byte encoding limitations). wv should be hacked to allow importing
>   .doc files.
>
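
The .abw workaround mentioned in the list above (removing "encoding=FOO" from
the first line) could be scripted roughly like this. It is only a sketch: the
function name is hypothetical, and it assumes the first line of the file is an
XML declaration of the form <?xml version="1.0" encoding="FOO"?>, which the
message does not spell out. It works on raw bytes so that it makes no
assumption about the charset of the rest of the file.

```python
import re

# Matches an encoding="..." (or '...') attribute together with its leading space.
ENCODING_ATTR = re.compile(rb"""\s+encoding\s*=\s*(["'])[^"']*\1""")


def strip_encoding_from_first_line(data):
    """Remove the encoding attribute from the first line if it is an XML declaration."""
    first, sep, rest = data.partition(b"\n")
    if first.lstrip().startswith(b"<?xml"):
        first = ENCODING_ATTR.sub(b"", first, count=1)
    return first + sep + rest
```

Only the first line is touched; a file that does not start with an XML
declaration is returned unchanged.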
>  What won't work with CJK text:
> * export to WML, DocBook. I just don't know how to specify charset name in
>     these formats.
> * import of XHTML (html importer assumes UTF8) and DocBook.
> * export to Word. It doesn't work for Latin1 yet, so forget it.
> * No code specific to platforms other than unix is touched, so CJK support is
>     in the same state on platforms other than unix.
> * Spellchecking of CJK texts. Does it ever make sense? English words can be
>  spellchecked inside CJK text.
>
>  Donations/fees are appreciated.
>  If anybody needs it, I can provide commercial support and extension of this
>  work.
>
>  Enjoy.
>
>  I'm going to bed, so I won't be able to read/post/hack for the next 11 hours.
>
>  Best regards,
>   -Vlad
>
>
This archive was generated by hypermail 2b25 : Sun Oct 29 2000 - 01:16:27 CST