RTF export (was: Re: One less for Bonita (patch))

From: Roland Kay <roland.kay_at_ox.compsoc.net>
Date: Wed Apr 13 2005 - 12:45:09 CEST

Hi Dom,

I managed to do some testing with Word and WordPad today. WordPad opens
exported docs with no problems, but Word cannot.

Word cannot even open the simple document:

{\rtf1\ansicpg65001\'e6\'b3\'95\'e6\'96\'87}

which leads me to suspect that it doesn't support UTF-8 RTF files (code
page 65001)! It's quite incredible really.

Looking at what WordPad produces when it saves a file, I note that it
doesn't make quite as much of a meal of Unicode characters as we do.
Rather than "\uc0\uXXXXX" for each character, WordPad just puts a "\uc1"
at the start of the file and then uses "\uXXXXX?" throughout the document.
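
For illustration, here is a minimal Python sketch of the WordPad-style
escaping described above. It is not our exporter's code; the function name
rtf_unicode_escape is made up for this example. It assumes "\uc1" has
already been written once at the start of the file, so each "\uN" is
followed by a single "?" fallback character. It also relies on RTF's "\uN"
taking a signed 16-bit decimal value, which means code units above 32767
are written as negative numbers and characters outside the BMP become
surrogate pairs. Control characters such as newlines are not handled.

```python
def rtf_unicode_escape(text: str) -> str:
    """Escape text for an RTF body using \\uN? (assumes \\uc1 is in effect)."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # RTF syntax characters must be backslash-escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            # Plain ASCII passes through unchanged.
            out.append(ch)
        else:
            # Split into UTF-16 code units so non-BMP characters
            # become surrogate pairs, one \uN? per unit.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536  # \uN is a signed 16-bit value
                out.append("\\u%d?" % unit)
    return "".join(out)


if __name__ == "__main__":
    # The two characters from the test document above.
    print("{\\rtf1\\ansi\\uc1 " + rtf_unicode_escape("\u6cd5\u6587") + "}")
```

The "?" is what a reader without Unicode support would display in place of
each escaped character.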

A convenient solution then might be to use the Windows Latin-1 code page
(1252) and then use \uXXXXX notation for anything that cannot be converted.
That would at least mean that Western European scripts wouldn't need any
hex markup. For double-byte scripts "\'XX\'XX" and "\uXXXXX?" take up about
the same space in any case.
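
A rough Python sketch of that proposal, to make the idea concrete. The
names rtf_body_cp1252 and rtf_document_cp1252 are hypothetical, and the
header is only a guess at the minimum needed. It assumes that readers
accept raw 8-bit bytes for characters in the declared code page; if that
turns out not to hold, the same characters could be written as "\'XX"
instead.

```python
def rtf_body_cp1252(text: str) -> bytes:
    """Encode text as cp1252 bytes, falling back to \\uN? where needed."""
    out = bytearray()
    for ch in text:
        if ch in "\\{}":
            # RTF syntax characters must be backslash-escaped.
            out += b"\\" + ch.encode("ascii")
            continue
        try:
            # Anything cp1252 can represent is written as a raw byte.
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Everything else becomes \uN? per UTF-16 code unit,
            # with N written as a signed 16-bit decimal.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out += b"\\u%d?" % unit
    return bytes(out)


def rtf_document_cp1252(text: str) -> bytes:
    """Wrap the body in a minimal header declaring cp1252 and \\uc1."""
    return b"{\\rtf1\\ansi\\ansicpg1252\\uc1 " + rtf_body_cp1252(text) + b"}"


if __name__ == "__main__":
    # "café" needs no markup; the CJK character falls back to \uN?.
    print(rtf_document_cp1252("caf\u00e9 \u6cd5"))
```

With this, "café" comes out as four plain bytes, and only characters
outside code page 1252 pick up any markup.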

What do you (and others) think?

Best wishes,

R.
Received on Wed Apr 13 12:55:51 2005