Re: html (etc) current status?

From: F J Franklin (F.J.Franklin@sheffield.ac.uk)
Date: Mon Sep 30 2002 - 05:58:54 EDT


On Sun, 29 Sep 2002, Jim Hettmer wrote:
> With 1.0.3 on RH7.3, I have technical documents created with an older
> Netscape, html of course, which I would like to be able to convert over
> to abw. Some small ones do, but normally I get the "bogus" message.
> I've tried tidy -clean -asxhtml and -asxml, but to no avail. The small
> html I created with abiword could be read back in provided I tidied it
> first.
> Also, maybe I'm being simplistic and/or naive, but it seemed to me
> that if abiword would indicate something about what it found fault with,
> I could possibly stumble around in the original and hand-edit the
> offending stuff out?

AbiWord has a native XHTML importer that requires its input to be valid
XML, but it has stricter requirements besides, some of them questionable.
Elements like <div> are used by AbiWord to indicate sections, but <div>
has a vast range of uses out there in the wild, so AbiWord gets confused.

If you have a debug build of AbiWord, as I think Hub suggested, then you
may get some useful nuggets of info about what the importer doesn't like.
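
Without a debug build, you can at least check the first requirement (that
the file is well-formed XML) outside AbiWord. The following is only a
minimal sketch using Python's standard expat bindings; the function name
first_xml_error is made up for illustration, and it knows nothing about
AbiWord's own stricter rules (such as how it treats <div>). Depending on
the DOCTYPE, named entities like &nbsp; may also be reported as errors.

```python
import xml.parsers.expat


def first_xml_error(text):
    """Return (line, column, message) for the first well-formedness
    error in `text`, or None if expat parses it without complaint.

    This checks XML well-formedness only; it is not AbiWord's importer
    and does not apply any of AbiWord's additional requirements.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # Second argument True means this is the final (only) chunk.
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


if __name__ == "__main__":
    import sys

    # Usage: python check_xml.py tidied.html  (script name is hypothetical)
    with open(sys.argv[1], "rb") as handle:
        problem = first_xml_error(handle.read())
    if problem is None:
        print("well-formed XML")
    else:
        print("line %d, column %d: %s" % problem)
```

Run it on the output of tidy; the line number it prints is where you would
start hand-editing. A file that passes may still be rejected by AbiWord.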

> I'd appreciate it if someone could tell me what the current state,
> thoughts, and recommendations are about importing html. Is there hope
> for me, here?

There is also an HTML importer which doesn't require the file to be XML
and is much more forgiving about tags - unfortunately, the importer is
unfinished and may drop text which is in lists or tables. Also, a lot of
style information is lost.

So, not a great solution.

If you are determined, then I would recommend importing your HTML docs
twice, once using the HTML importer, and once using the text importer.

I'm working my way towards a new, improved [X]HTML <-> ABW converter, but
it's a while away yet.

Regards, Frank

Francis James Franklin
F.J.Franklin@shef.ac.uk

  `Medium atomic weights are available: Gold, Lead, Copper, Jet, Diamond,
Radium, Sapphire, Silver and Steel.
  `Sapphire and Steel have been assigned...'


This archive was generated by hypermail 2.1.4 : Mon Sep 30 2002 - 06:04:15 EDT