RDF

From AbiWiki


Current revision as of 04:58, 12 October 2011

What is RDF: The Resource Description Framework

RDF is a way for information to be represented in a computer such that it is usable to both humans and computers.

RDF provides fine-grained and very explicit markup to allow concepts from the real world to be seen unambiguously by a computer. For example, people familiar with the English language will read a reference to Tom as very likely referring to a particular person. In RDF, using FOAF, a computer will know that the string "Tom" is indeed associated with a person, and will quite possibly know other information about the person such as their phone number, email address, and social network.

The inclusion of RDF in the OpenDocument Format (ODF) expands the way in which people can communicate information; a single file can contain presentation, text content, and now also explicit meaning through RDF. People, Places, Events, Logistics, and more can be easily shared. Word processing software moves beyond the traditional processing of simple strings of letters to performing reasoning on information and helping you manage the things in your life.

ODF allows RDF to be associated with parts of the document through links that give a word or phrase a unique identifier, which the RDF can then refer to. This way, if you have two "Tom"s in your document who are actually two different people, you can have different RDF for each Tom. Even though the document text only contains "Tom", the computer will understand that each one is a different person and will know the respective information about each.


RDF in Abiword: An Overview

Much of the work on RDF in abiword is recent and as such you will need the development branch of abiword to use it.

Abiword can preserve RDF through many file formats: ODF, RTF, and the native abw format are supported. RDF is also preserved across clipboard actions, including to and from other RDF aware applications such as Calligra.

Creating RDF in an abiword document can be as simple as dragging a contact from the Evolution groupware suite into the document. Of course, you are also free to explicitly associate RDF triples with any part of the document to fulfill your needs.

There is a top level RDF menu in Abiword which allows you to edit the raw RDF triples, execute a query against all the RDF in the document, and highlight passages of text in the document which are explicitly linked to RDF triples.

To connect RDF with a part of the document text, abiword uses RDF Links. An RDF link can be associated with one or more RDF triples.

If you would like to associate some RDF with a word or phrase in the document, select that phrase and choose Insert/RDF Link from the menu. This will ask you for a name for the new link in a manner similar to creating a new bookmark. Once you have a new RDF Link, you can right click on it and choose to edit and query the RDF associated with just that link. Initially the RDF for an RDF Link will be empty.

When you choose RDF/Highlight RDF from the menu all your RDF links will be shown in a different color and underlined. This allows you to easily see where RDF is associated in your document. RDF should not be a hidden element in the document; it is there to help you and as such should be quick to find when desired.

Getting RDF into and out of a document

All of the RDF for a document is available through the RDF/Edit RDF Triples... menu item. The resulting dialog will allow you to create, edit, and delete triples. You can also use the File/Import and Save As... menu items in that dialog to read an entire RDF/XML file into your document or save all the RDF in the document to an RDF/XML file.

Initial support for Drag and Drop from other applications has been added to abiword, allowing you to drop contact and event information from other applications right into an abiword document and import that information as RDF. As this provides a very simple interface for interacting with RDF, drag and drop will likely be expanded to other data formats, and to dragging items out of abiword, in the future.

Another way to get RDF into a document is to select a phrase in the document and choose Insert/RDF Link from the menu. This will ask you for an identifier for the RDF Link which you can then use to associate RDF directly with just that portion of the document. To edit the RDF for an RDF Link, right click on it and choose "Show RDF". Note that this dialog is very similar to the one which allows you to edit the RDF for the entire document, but is restricted to only the RDF which is associated with this RDF Link.

RDF can also be moved into and out of a document using copy and paste as described below.

Query Your RDF

Queries on RDF are commonly expressed in SPARQL. One might like to consider SPARQL as the RDF equivalent of SQL.

You can run a SPARQL query against all the RDF in the document or against only the RDF for an RDF Link. To query all the RDF, choose RDF/SPARQL Query from the menu. To query just the RDF for a particular RDF Link, right click on that link and choose SPARQL Query from the context menu. Note that a default query is provided in both of these cases which will return all the RDF for the document or RDF Link.

Since RDF Links are just one method of associating part of the document with RDF, there is also the RDF/Show RDF for cursor position menu item. If the cursor is on an RDF Link then Show RDF for cursor position does the same thing as selecting Show RDF from the context menu of the RDF Link. If the cursor is in an area of the document which has RDF associated not through an RDF Link then Show RDF for cursor position will generate a SPARQL query to show you just the RDF for where the cursor is located.

RDF Copy and Paste

If you copy an RDF Link and paste it into the same document then the name of the pasted RDF Link will be altered, as no two RDF Links in a document can have the same name. Abiword retains the original name, prefixed with "x-" and suffixed with a generated document-wide unique identifier. For example, if you select an RDF Link "foo" and then paste it into the document you might get an RDF Link "x-foo-2324".

If you are pasting an RDF Link from one document into another, the RDF Link name might not be in use already. If the RDF Link name is not in use then it will be preserved as it was. This allows RDF Links to be copied and pasted across documents without altering their names at all.

Note that not only the RDF Link is copied and pasted, but also the actual RDF Triples which are associated with that RDF Link.
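
The naming rule described above can be sketched as a small shell function. This is only an illustration of the rule, not Abiword's implementation: the function name pasted_link_name is hypothetical, and the unique identifier is passed in by the caller because this page does not say how Abiword generates it.

```shell
#!/bin/sh
# Hypothetical helper illustrating the paste naming rule; not Abiword code.
# Usage: pasted_link_name NAME UNIQUE_ID [EXISTING_NAME...]
# Prints NAME unchanged when it is not among the existing names,
# otherwise prints "x-NAME-UNIQUE_ID".
pasted_link_name() {
    name=$1
    uid=$2      # ASSUMPTION: supplied by the caller; Abiword generates its own
    shift 2
    for existing in "$@"; do
        if [ "$existing" = "$name" ]; then
            echo "x-$name-$uid"
            return 0
        fi
    done
    echo "$name"
}
```

For example, pasted_link_name foo 2324 foo bar prints x-foo-2324, while pasted_link_name foo 2324 bar prints foo because the name is still free in the target document.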

RDF in AbiCommand

Abiword includes the [http://www.abisource.com/wiki/AbiCommand AbiCommand] plugin which allows command line access to abiword. There are many abicommands that have been added to allow rich RDF interaction through abicommand. One utility of the command line interface to RDF is the [https://github.com/monkeyiq/odf-2011-track-changes-tests/blob/678de7d9a32213a7ec622325de6b2fe24ddbb871/rdf/RDF.pm Test Suite] which may provide you with some ideas as to how to perform automated actions on the RDF of a document using abiword.

To start using AbiCommand run the following from a console:

abiword --plugin AbiCommand

The load command accepts a file name as its first argument and will load that file into your current session.

AbiWord:> load /tmp/myfile.odt

There are around 30 commands to deal with document RDF in abicommand. You can either work with all the RDF for the current document or a subset of that RDF. If you are working with a subset and wish to work with all the RDF again use the rdf-clear-context-model command. The rdf-set-context-model-pos and rdf-set-context-model-xmlid commands let you work with the subset of RDF which is relevant at a given document position or for the xml:id value you provide. The xml:id you pass to rdf-set-context-model-xmlid is simply the name of the RDF Link from your document. They are the same thing but RDF Link was seen as a simpler concept for the GUI to present to the user.

The xml:id values contained in the document can be queried with rdf-get-all-xmlids. The xml:id values in scope at a given point in the document can be found with rdf-get-xmlids. Going the other way, the scope, start and end of where an xml:id is in effect can be found with rdf-get-xmlid-range, rdf-movept-xmlid-start and rdf-movept-xmlid-end respectively. To create a new RDF Link/xml:id, first create a selection using movept, selectstart and movept again and then use rdf-xmlid-insert xmlid to create an RDF Link with the given xmlid as its name. An RDF Link can be deleted using rdf-xmlid-delete.

The rdf-import command reads an RDF/XML file and imports all of its RDF triples into the document. The rdf-export command saves all the RDF from the document to an RDF/XML file. The rdf-dump command is a lower level command designed for debugging which shows you each RDF triple in a plain text format.

A small group of abicommands allows you to access the individual triples from the RDF. Note that these work with the current context-model, which is either all the RDF for the document or a subset as described above. The rdf-context-show-objects command will give you all objects for a given subject, predicate pair. Likewise, rdf-context-show-subjects will display all the subjects for a given predicate, object pair. The rdf-context-contains command will test if a given RDF triple is in the document. For any given subject, rdf-context-show-arcs-out will show you the predicate, object pairs with that subject. To find the count of triples use the rdf-size command.

A SPARQL query can be executed against the document with rdf-execute-sparql. This command allows you to enclose the query in double quotes and have it span over multiple lines.
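
As a rough sketch of what such a multi-line query might look like, the shell function below prints a SPARQL query that selects every predicate and object for one subject. The function name sparql_for_subject is hypothetical, and uri:wingb is simply the example subject used later on this page; the query itself is plain SPARQL and is not taken from abiword.

```shell
#!/bin/sh
# Hypothetical helper; prints a SPARQL query listing every
# predicate/object pair recorded for the subject URI given as "$1".
sparql_for_subject() {
    cat <<EOF
SELECT ?p ?o
WHERE {
  <$1> ?p ?o .
}
EOF
}
```

Running sparql_for_subject uri:wingb prints a four-line query which could then be pasted, enclosed in double quotes, after rdf-execute-sparql at the AbiWord:> prompt.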

To change any RDF in abiword, you must create a mutation object. A mutation is designed as a short lived object which allows multiple modifications to be performed on your RDF finishing with a commit or rollback. A mutation is created with rdf-mutation-create, and can be committed with rdf-mutation-commit or rolled back with rdf-mutation-rollback. Before the commit or rollback, triples can be added and removed with rdf-mutation-add and rdf-mutation-remove.
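
The create, modify, commit sequence might be sketched as follows. The command names come from the paragraph above, but the argument layout of rdf-mutation-add (subject, predicate, object separated by spaces) is an assumption that this page does not confirm, and mutation_script is a hypothetical helper that only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper; prints an AbiCommand sequence that adds one triple.
# Usage: mutation_script SUBJECT PREDICATE OBJECT
mutation_script() {
    echo "rdf-mutation-create"
    # ASSUMPTION: rdf-mutation-add takes subject, predicate and object
    # as three space-separated arguments; check the AbiCommand help.
    echo "rdf-mutation-add $1 $2 $3"
    echo "rdf-mutation-commit"
    # rdf-size reports the triple count, which should have grown by one.
    echo "rdf-size"
}
```

Replacing rdf-mutation-commit with rdf-mutation-rollback in the printed sequence would discard the added triple instead of storing it.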

Direct RDF Wrangling

As an advanced topic, one should remember that they can also update the RDF of a document by expanding the odt container file and editing the manifest.rdf and content.xml manually. Note that this uses the sopranocmd from KDE4 for actual RDF wrangling.

There are some humans who do not like to eyeball parse RDF/XML files directly. Being one of these people, I use the following rdf-cat command to print the triples in a manner I find more legible.

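 $ cat ~/bin/rdf-cat
 #!/bin/bash
 # Usage: rdf-cat FILE [SERIALIZATION]   (SERIALIZATION defaults to rdfxml)
 # Imports FILE into a scratch redland database and lists its triples.
 # WORKDIR is used instead of TMPDIR so the standard variable is left alone.
 WORKDIR=/tmp/print-rdf-`id -u`
 rm -rf "$WORKDIR"
 mkdir "$WORKDIR"
 SRCDIR=`pwd`
 cd "$WORKDIR"
 FMT=${2:-rdfxml}
 # FILE is taken relative to the directory rdf-cat was started from.
 sopranocmd --backend redland --serialization "$FMT" import "$SRCDIR/$1"
 sopranocmd --backend redland list
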


You should be able to expand an ODF file using the unzip command

 mkdir expanded
 cd    expanded
 unzip ../document.odt

If there is no manifest.rdf file in the existing ODF file, you need to ensure that META-INF/manifest.xml contains the following among its file-entry XML elements.

<manifest:file-entry manifest:media-type="application/rdf+xml" manifest:full-path="manifest.rdf"/>
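
A quick way to see which case applies might look like the sketch below, run against the "expanded" directory. Both function names are hypothetical, and the check is a plain text match on the full-path attribute rather than a real XML parse, so treat it as a convenience only.

```shell
#!/bin/sh
# Hypothetical helpers; not part of abiword or sopranocmd.
# Succeeds when the expanded ODF directory "$1" already declares
# manifest.rdf in META-INF/manifest.xml (fixed-string match, no XML parsing).
has_rdf_file_entry() {
    grep -qF 'manifest:full-path="manifest.rdf"' "$1/META-INF/manifest.xml" 2>/dev/null
}

# Prints two lines describing the expanded ODF directory "$1":
# whether manifest.rdf exists and whether its file-entry is declared.
report_rdf_manifest() {
    if [ -f "$1/manifest.rdf" ]; then
        echo "manifest.rdf: present"
    else
        echo "manifest.rdf: missing"
    fi
    if has_rdf_file_entry "$1"; then
        echo "file-entry: present"
    else
        echo "file-entry: missing"
    fi
}
```

Calling report_rdf_manifest expanded from the parent directory shows at a glance whether the file-entry still has to be added by hand.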

If the ODF file has an existing manifest.rdf file, you can import it into a database file in /tmp/tt with the below commands.

 $ mkdir /tmp/tt
 $ cp    manifest.rdf /tmp/tt
 $ cd /tmp/tt
 $ sopranocmd --backend redland  --serialization rdfxml import manifest.rdf 

If there is no existing manifest.rdf file, then run the following commands to initialize a database in /tmp/tt

 $ mkdir /tmp/tt
 $ cd /tmp/tt
 $ sopranocmd --backend redland list

RDF Triples can be added with the following command

 sopranocmd --backend redland add \
   '<uri:wingb>' '<http://www.w3.org/2002/12/cal/icaltzd#uid>'       'wingb'
 sopranocmd --backend redland add \
   '<uri:wingb>' '<http://docs.oasis-open.org/opendocument/meta/package/common#idref>' "wingb"

and when you are done you can create a new manifest.rdf file with

 sopranocmd --backend redland  --serialization rdfxml export manifest.rdf

Copy the manifest.rdf file to the same location as content.xml from the expanded ODF file. In the above case, this is the "expanded" directory.

To associate some of the RDF with part of the text in content.xml, you might like to add a <text:meta> XML element with xml:id="wingb" somewhere in content.xml. The wingb value is the object referenced by a common#idref predicate above.
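
After editing content.xml it can be worth confirming that the xml:id really is there, since a mismatch between the idref object and the xml:id leaves the RDF unattached. The sketch below does this with a fixed-string search; xmlid_in_content is a hypothetical helper, and the element shown in the comment is only an assumed example of what the edited markup might look like.

```shell
#!/bin/sh
# Hypothetical helper; not part of abiword or sopranocmd.
# Usage: xmlid_in_content XMLID CONTENT_XML
# Succeeds when CONTENT_XML contains an attribute xml:id="XMLID", e.g.
#   <text:meta xml:id="wingb">some text</text:meta>   (assumed example markup)
xmlid_in_content() {
    grep -qF "xml:id=\"$1\"" "$2"
}
```

For the triples added above, xmlid_in_content wingb expanded/content.xml should succeed once the element has been added.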

You can then create a new ODF file using the following command

 zip -D -X -0 /tmp/product.odt mimetype \
 && find . -name .svn -prune -or -name "*~" -or -name mimetype -prune -or -name Makefile -prune -or -print \
  | zip  -D -X -0  -@ /tmp/product.odt
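
The first zip invocation above stores the mimetype file first, uncompressed and without extra fields, so that the media type string sits at a fixed offset in the resulting file. A minimal sanity check of that layout, assuming the usual 30 byte zip local file header, might look like this; odf_mimetype_first is a hypothetical helper and it inspects raw bytes only, it does not validate the archive.

```shell
#!/bin/sh
# Hypothetical helper; checks the byte layout only, not zip validity.
# Succeeds when file "$1" has the entry name "mimetype" at byte offset 30
# and an OpenDocument media type stored uncompressed right after it.
odf_mimetype_first() {
    name=$(dd if="$1" bs=1 skip=30 count=8 2>/dev/null)
    # 35 bytes is the length of the common OpenDocument media type prefix.
    kind=$(dd if="$1" bs=1 skip=38 count=35 2>/dev/null)
    [ "$name" = "mimetype" ] && [ "$kind" = "application/vnd.oasis.opendocument." ]
}
```

Running odf_mimetype_first /tmp/product.odt && echo ok after the zip step gives a quick indication that mimetype was not compressed or placed after other entries.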