How Owl 2.0 Imports Work

From Protege Wiki
Revision as of 11:44, November 30, 2009 by Tredmond (talk | contribs) (Building XML Catalogs)


OWL 2.0 Imports

Under Construction!


Motivation

In OWL 2, imports are handled differently than they are in OWL 1.0. There have been two main changes:

  • added support for versions of an ontology
  • using import by location rather than import by name.

The motivation for the first of these changes is pretty clear. OWL 2.0 supports versions by allowing an ontology to have two IRIs in its name. The first is the ontology IRI. The second is the version IRI for the ontology. In many cases the version IRI will be null. But when the version IRI is not null, the ontology is a specific version of the ontology named by the ontology IRI.

Thus for example, I might have an ontology that I am working on which I call

   http://www.tigraworld.com/protege/determinants.owl.

After a while I start needing versions of this ontology, so I create an ontology with an ontology IRI

   http://www.tigraworld.com/protege/determinants.owl

and a version IRI

   http://www.tigraworld.com/protege/determinants-1.0.owl.

A later published version of this ontology might have the version IRI

   http://www.tigraworld.com/protege/determinants-2.0.owl.

The scheme by which these versions are named is not defined by the OWL 2.0 specification.

The intent is that these IRIs can be used to look up an ontology. If an ontology has a version IRI, then following the version IRI using the specified protocol should retrieve the ontology with that version. Thus version 1.0 of the determinants ontology can be found at the web location

   http://www.tigraworld.com/protege/determinants-1.0.owl.

Following the ontology IRI, e.g.

   http://www.tigraworld.com/protege/determinants.owl.

should retrieve the latest version of that ontology (which may or may not have a version IRI). However, it should be noted that the reason OWL 2.0 uses import by location is that it is often the case that ontologies cannot be retrieved by name.

When importing, these two names allow ontology developers to specify which version of an ontology they want to import. They can specify a particular version of an ontology by importing its version IRI. They can specify the latest version of an ontology by importing the ontology IRI.

The second change to OWL 2.0 imports is the main subject of this note. OWL 2.0 uses an import by location scheme rather than the import by name scheme used in OWL 1.0. This simply means that an import declaration is a directive to import the ontology that can be found at the physical location represented by the imported IRI. The reason that OWL 2.0 changed to import by location is that in many cases ontologies cannot be found by name. As a result, many OWL ontologies could not use the import by name scheme, because there would be no way for applications or users to find the imported ontology. With import by location, the importing ontology always states where the imported ontology can be found.

Offline Editing and XML Catalogs

The disadvantage of the import by location scheme is that it adds a bit of complexity when a user wants to download some ontologies from the internet and either edit them on the hard drive or work with them while offline. To make this concrete, suppose that there are two ontologies on the web at the locations

    http://www.tigraworld.com/protege/determinants.owl

and

    http://www.tigraworld.com/protege/continuedFractions.owl.

Suppose that the determinants.owl ontology imports the continuedFractions.owl ontology with the following import declaration

    import http://www.tigraworld.com/protege/continuedFractions.owl.

If the user downloads these ontologies to disk and invokes an ontology editing tool on the determinants.owl ontology, then the ontology editing tool will naturally import the continuedFractions.owl ontology from its web site at

    http://www.tigraworld.com/protege/continuedFractions.owl.

If the user wants the import of continuedFractions.owl to redirect to the copy of the continuedFractions.owl ontology on the user's local disk, the user needs to use XML Catalogs. XML Catalogs allow the user to specify that the process of resolving the import

    import http://www.tigraworld.com/protege/continuedFractions.owl.

be redirected to a specific location on the local drive. For users who are familiar with Protege 3.4 ontology repositories, the XML catalog will play a role very similar to that of the .repository files in Protege 3.4. The big advantage of XML catalogs is that they are a standard mechanism that can be used by any tool that understands OWL.

Thus XML Catalogs will become an essential part of sharing ontologies. It is therefore important that tools support a variety of mechanisms for generating XML Catalogs.
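To make the redirection concrete, here is a minimal Python sketch that writes a catalog with one OASIS XML Catalog uri entry and then consults it when resolving an import. The helper names build_catalog and resolve and the local file name are assumptions made for illustration; this is not how Protege itself implements catalog handling.

```python
import xml.etree.ElementTree as ET

# Namespace of the OASIS XML Catalogs format.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(redirects):
    """Return XML catalog text mapping each import IRI to a local file.

    redirects is a dict: import IRI -> local location (hypothetical helper).
    """
    ET.register_namespace("", CATALOG_NS)
    root = ET.Element(f"{{{CATALOG_NS}}}catalog")
    for name, local in redirects.items():
        ET.SubElement(root, f"{{{CATALOG_NS}}}uri", {"name": name, "uri": local})
    return ET.tostring(root, encoding="unicode")


def resolve(catalog_xml, import_iri):
    """Return the redirected location for import_iri, or the IRI unchanged."""
    root = ET.fromstring(catalog_xml)
    for entry in root.iter(f"{{{CATALOG_NS}}}uri"):
        if entry.get("name") == import_iri:
            return entry.get("uri")
    return import_iri  # no redirect: fall back to the web location


catalog_text = build_catalog({
    "http://www.tigraworld.com/protege/continuedFractions.owl":
        "continuedFractions.owl",  # assumed local file name
})
```

With this catalog, resolving the continuedFractions.owl import yields the local file, while any import that has no entry is still fetched from its web location.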

Building XML Catalogs

In this section, we will consider the problem of automatically generating XML catalogs. Assuming that a set of files has been downloaded from the internet to a disk, the question we want to ask is

Where can these files be found on the internet?

Obviously, in general this question cannot be answered. However, ontologies contain a few pointers that are supposed to indicate where they can be found, specifically the xml base, the ontology version and the ontology name. Note that the preferred approach when users share OWL ontologies through e-mail is for them to provide an XML catalog with the OWL files that they share. This would make the automatic generation of the XML catalog unnecessary.

Generating XML Catalogs During download

This is the most accurate way of building an xml catalog. The other methods described here are heuristics and as such are optional and can be overridden by a user. I will describe this process with a simple example.

Suppose a user wants to download the ontology

    http://www.tigraworld.com/protege/determinants.owl

and its imports to disk. As part of this download process, Protege will build an xml catalog which reflects where each of the imports was found. So when the Protege tool processes the import statement

    import http://www.tigraworld.com/protege/continuedFractions.owl

it will convert

    http://www.tigraworld.com/protege/continuedFractions.owl

to a URL and download what it finds into a file (probably continuedFractions.owl) on the hard disk. When it does this, it can add an entry to the XML catalog reflecting that an import of

    http://www.tigraworld.com/protege/continuedFractions.owl

should be redirected to the continuedFractions.owl file on the disk.

To be fully robust, this algorithm will have to be a little more complicated than this. For example, we have seen ontologies where the same ontology is imported using multiple distinct URIs. This means that the download algorithm would need to detect duplicates and do the right thing both in terms of how it saves the files and how it updates the XML catalog.
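The download process, including one possible form of duplicate detection, can be sketched as follows in Python. This is only an illustration under stated assumptions: fetch and find_imports are hypothetical hooks supplied by the caller (so nothing here touches the network), files are kept in memory rather than written to disk, and two downloads count as duplicates only when their content is byte-identical, which is just one of several criteria a real tool might use.

```python
import hashlib
import posixpath
from urllib.parse import urlparse


def download_with_catalog(root_iri, fetch, find_imports):
    """Walk the imports closure starting from root_iri.

    fetch(iri) -> bytes and find_imports(content) -> list of IRIs are
    hypothetical hooks provided by the caller.
    Returns (files, catalog): file name -> content, import IRI -> file name.
    """
    files, catalog, by_digest = {}, {}, {}
    pending = [root_iri]
    while pending:
        iri = pending.pop()
        if iri in catalog:
            continue  # this IRI has already been handled
        content = fetch(iri)
        digest = hashlib.sha256(content).hexdigest()
        if digest in by_digest:
            # Same document reached through a different IRI: reuse the
            # saved file and only add a second catalog entry.
            catalog[iri] = by_digest[digest]
            continue
        name = posixpath.basename(urlparse(iri).path) or "ontology.owl"
        while name in files:
            name = "_" + name  # do not overwrite a different document
        files[name] = content
        by_digest[digest] = name
        catalog[iri] = name
        pending.extend(find_imports(content))
    return files, catalog
```

The returned catalog dictionary holds one entry per import IRI encountered, so an ontology reached through two distinct IRIs is saved once and redirected twice.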

Using XML Base

This algorithm is a heuristic and can be turned off or overridden. This is the recommended algorithm because it is fast and it generally returns the information that the user needs.

In the specification of xml base, it is stated that the xml base should represent the location where a file can be found. Thus if the continuedFractions.owl file is found on disk (in RDF/XML format), it is very likely that its xml base will be

    http://www.tigraworld.com/protege/continuedFractions.owl.

This would suggest that an import statement of the form

   import http://www.tigraworld.com/protege/continuedFractions.owl

can safely be redirected to the continuedFractions.owl file on disk and the xml catalog can be updated accordingly.
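A minimal Python sketch of this heuristic, assuming the file is RDF/XML and carries its xml:base attribute on the root element. The function name catalog_entry_from_xml_base is hypothetical, and the sample document is made up for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:base attribute under the XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def catalog_entry_from_xml_base(rdfxml_text, local_path):
    """Return (import IRI, local path) guessed from xml:base, or None.

    Only the root element is inspected; a file without xml:base yields None
    so that the caller can fall back to another heuristic.
    """
    root = ET.fromstring(rdfxml_text)
    base = root.get(XML_BASE)
    return (base, local_path) if base else None


sample = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xml:base="http://www.tigraworld.com/protege/continuedFractions.owl"/>'
)
entry = catalog_entry_from_xml_base(sample, "continuedFractions.owl")
```

Because only the root element is read, the check stays cheap even for large ontology files, which fits the claim that this heuristic is fast.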


Using the Ontology IRI or Version IRI

If an ontology has a version IRI, then it should be the case that this ontology can be retrieved by turning the version IRI into a URL. Thus if I have a file on disk called determinants.owl which has an ontology IRI of

   http://www.tigraworld.com/protege/determinants.owl

and a version IRI of

   http://www.tigraworld.com/protege/determinants-1.0.owl

then this version of the ontology should be found at the location

   http://www.tigraworld.com/protege/determinants-1.0.owl.

This means that if I have a file on disk called determinants.owl which has an ontology IRI of

   http://www.tigraworld.com/protege/determinants.owl

and a version IRI of

   http://www.tigraworld.com/protege/determinants-1.0.owl

then I can guess that the imports directive of

   import http://www.tigraworld.com/protege/determinants-1.0.owl

can be safely redirected to the determinants.owl file on the disk.

Similarly, if an ontology has a name but no version IRI then it should be possible to find the ontology using the name. Both cases described so far should work and are pretty safe heuristics.
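Both cases can be sketched in Python as follows, assuming the file is RDF/XML with the owl:Ontology element directly under the root, the ontology IRI in rdf:about and the version IRI in an owl:versionIRI child. The function name guess_import_iri is hypothetical, and the sample document is made up for the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"


def guess_import_iri(rdfxml_text):
    """Return the IRI whose import can be redirected to this file, or None.

    With a version IRI, that IRI is returned; otherwise the ontology IRI.
    """
    root = ET.fromstring(rdfxml_text)
    onto = root.find(f"{{{OWL}}}Ontology")
    if onto is None:
        return None
    version = onto.find(f"{{{OWL}}}versionIRI")
    if version is not None and version.get(f"{{{RDF}}}resource"):
        return version.get(f"{{{RDF}}}resource")
    return onto.get(f"{{{RDF}}}about")  # None for an anonymous ontology


sample = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Ontology rdf:about="http://www.tigraworld.com/protege/determinants.owl">
    <owl:versionIRI
        rdf:resource="http://www.tigraworld.com/protege/determinants-1.0.owl"/>
  </owl:Ontology>
</rdf:RDF>
"""
guessed = guess_import_iri(sample)
```

When a version IRI is present, the sketch deliberately does not return the ontology IRI, since that IRI is meant to lead to the latest version, which may not be the file on disk.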