How Owl 2.0 Imports Work

From Protege Wiki
Revision as of 09:41, January 10, 2009 by Tredmond (talk | contribs) (Shareable Imports)

OWL 2.0 Imports

My understanding is that OWL 2.0 imports are based on an import by location scheme. In section 3.2 of the Structural Specification and Functional-Style Syntax, the notion of using an IRI to access an ontology document is introduced:

  • "Each ontology document can be accessed from an IRI by means of an appropriate protocol."
  • "Each ontology document can be converted in some well-defined way into an ontology (i.e., into an instance of the Ontology class from the structural specification)."

This notion of access makes provision for the idea that tools may redirect access to an ontology to a different location:

"OWL 2 tools will often need to implement functionality such as caching or off-line processing, where ontology documents may be stored at addresses different from the ones dictated by their ontology IRIs and version IRIs. OWL 2 tools may implement a redirection mechanism: when a tool is used to access an ontology document at IRI I, the tool may redirect I to a different IRI DI and access the ontology document from there instead. The result of accessing the ontology document from DI must be the same as if the ontology were accessed from I."

The important part of this quote is the last sentence, which says that the result of a redirection should be the same as the result that would be obtained by using the IO scheme indicated by the IRI. Thus the result of performing an IO operation is the final arbiter of the intended meaning of the import.

This scheme essentially views imports as IO-directives. This is a very simple approach to ontology imports when the IO operations behave the same for all users. In these days of a highly reliable and accessible internet, this assumption will often hold. Most import directives point to the internet, and these directives are usually easily resolved. However, there are a variety of situations where the IO-directive based approach will not work very well:

  • a user is offline for a period. Even in these days there are situations where users do not have reliable access to the internet.
  • an application cannot trust the IO operations specified in the imports directives. In particular, many applications must be able to run even when the internet is not available.
  • the IO-mechanism indicated by the imports directive is protected by security mechanisms such as a firewall.
  • the IO-mechanism indicated by the imports directive is only applicable in a particular runtime environment. For instance, users are increasingly developing ontologies that are accessible only when a local web container (e.g. Tomcat) or agent-based environment is running.

Each of these situations creates a challenge for users or developers who want to share ontologies.

In addition, it is becoming increasingly common to import an ontology using an IRI that cannot be found in the ontology being imported. For example, in a recent ontology, an import statement used the IO address "http://purl.org/obo/owl/OBO_REL" to refer to an ontology called "http://purl.org/obo/owl/relationship". Combined with the fact that import trees are becoming increasingly complex, this makes it awkward for offline users to work out which ontologies an import is intended to bring in.

What follows is a series of possible approaches that might be taken to mitigate these difficulties. There are no tools yet that include these workarounds and it is not clear what mechanisms will actually be used.

Shareable Imports

There is a case where the OWL 2.0 specifications suggest how imports should work even when the IO mechanisms are not available. In particular, if an ontology document, O, has an ontology IRI, v, and no ontology version IRI, then a directive of the form "import v" probably means import the ontology document O. Similarly, if an ontology document, O, has an ontology version IRI, v, then a directive of the form "import v" probably means import the ontology document O. Finally - modulo issues of getting the wrong version of an ontology - if an ontology document, O, has an ontology IRI, v, then a directive of the form "import v" probably means import the ontology document O. I will call import declarations that follow this discipline shareable imports because they encourage sharing of ontologies between different users and different environments.
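The three heuristics above can be sketched as a lookup over locally available ontology documents. This is only an illustration under stated assumptions: the OntologyDocument record, the resolve_shareable_import function, and the order in which the heuristics are tried are my own choices, not something the specification or any existing tool prescribes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OntologyDocument:
    """Header information read from a locally available ontology document."""
    path: str                          # hypothetical local file name
    ontology_iri: Optional[str]
    version_iri: Optional[str] = None


def resolve_shareable_import(
    import_iri: str, documents: Sequence[OntologyDocument]
) -> Optional[OntologyDocument]:
    """Return the document that import_iri probably refers to, or None."""
    # A version IRI match identifies one specific document.
    for doc in documents:
        if doc.version_iri == import_iri:
            return doc
    # An ontology IRI match on a document that has no version IRI.
    for doc in documents:
        if doc.ontology_iri == import_iri and doc.version_iri is None:
            return doc
    # An ontology IRI match on a versioned document. This may select the
    # wrong version, so it is tried last (an assumed ordering).
    for doc in documents:
        if doc.ontology_iri == import_iri:
            return doc
    return None
```

An import such as http://purl.org/obo/owl/OBO_REL, which matches neither the ontology IRI nor the version IRI of any available document, yields None here. That is exactly the situation in which the import is not shareable.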

The advantage of these heuristics is that, when they are applicable, they restore all the advantages of the OWL 1.0 import by name scheme. The meaning of the import directives can be determined by looking at the content of the ontologies alone. Thus when offline, a user or tool can determine the import tree for a collection of ontologies simply by reading the ontology documents. There is no need for a tool-specific representation of the imports graph that is separate from the ontology content. When possible, this scheme for importing ontologies would seem to be desirable. Ontology development tools will probably supply utilities that convert import statements into this format.
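As a rough sketch of what "simply by reading the ontology documents" could look like, the following computes the set of ontologies reachable from a root using only the names found in the document headers. The Header tuple and the import_closure function are hypothetical names. The sketch assumes that the ontology IRI, version IRI and import IRIs have already been parsed out of each document, and it identifies an ontology by its ontology IRI alone, so two versions of the same ontology are not distinguished.

```python
from typing import Dict, List, Optional, Set, Tuple

# (ontology IRI, version IRI or None, IRIs named in the import directives)
Header = Tuple[str, Optional[str], List[str]]


def import_closure(root_iri: str, headers: List[Header]) -> Tuple[Set[str], Set[str]]:
    """Return (ontology IRIs reached from root_iri, import IRIs matching no header)."""
    # Index every document under both of its names.
    by_name: Dict[str, Header] = {}
    for header in headers:
        ontology_iri, version_iri, _ = header
        by_name.setdefault(ontology_iri, header)
        if version_iri is not None:
            by_name[version_iri] = header

    reached: Set[str] = set()
    unresolved: Set[str] = set()
    pending = [root_iri]
    while pending:
        iri = pending.pop()
        header = by_name.get(iri)
        if header is None:
            # Not a shareable import: no document carries this name.
            unresolved.add(iri)
            continue
        ontology_iri, _, imports = header
        if ontology_iri in reached:
            continue  # already visited; also guards against import cycles
        reached.add(ontology_iri)
        pending.extend(imports)
    return reached, unresolved
```

The second element of the result lists the imports that cannot be interpreted from the ontology content alone and would need some other mechanism.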

These heuristics are based on the following requirements from the OWL 2.0 specifications:

  • "If O contains an ontology IRI OI but no version IRI, then the ontology document of O should be accessible from the IRI OI."
  • "If O contains an ontology IRI OI and a version IRI VI, then the ontology document of O should be accessible from the IRI VI; furthermore, if O is the current version of the ontology series with the IRI OI, then the ontology document of O should also be accessible from the IRI OI."

We believe that most of the time ontology developers will be able to live by these conditions. However, there are many cases where these conditions are impossible to meet. One problem is that when organizations change, the locations of the ontologies cannot be maintained. Even the W3C has not been able to meet these requirements (e.g. where is the ontology with the name http://www.w3.org/2003/11/swrl?). In addition, purl sites - intended to correct these types of problems - are turning out to be a primary source of ontologies that cannot be found by their name.

So the major disadvantage of shareable imports is that they often cannot be applied while staying true to the OWL 2.0 specifications. In those cases where ontologies cannot be found by their name or their version name, the shareable imports approach is not applicable.

Tool Specific Repository Mechanisms

In cases where a user or application is offline and the meaning of import declarations cannot be determined, an ontology tool will need to include a table indicating how to redirect the imports. Take the case where a tool is offline and is trying to resolve an import from the location http://purl.org/obo/owl/OBO_REL. Suppose also that

  1. there is no available ontology with a name or version name of http://purl.org/obo/owl/OBO_REL.
  2. there is an available ontology with the name http://purl.org/obo/owl/relationship.

In this case, the ontology tool must have a mapping from the location http://purl.org/obo/owl/OBO_REL to the file containing the ontology with the name http://purl.org/obo/owl/relationship. These mappings can be held in a tool specific file.
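A minimal sketch of such a tool-specific table follows. The JSON layout, the function names load_redirects and redirect, and the file path ontologies/relationship.owl are all assumptions made for illustration. No existing tool is claimed to use this format.

```python
import json
from typing import Dict, Optional


def load_redirects(text: str) -> Dict[str, str]:
    """Parse a JSON object that maps import locations to local file paths."""
    table = json.loads(text)
    if not isinstance(table, dict):
        raise ValueError("redirect table must be a JSON object")
    return {str(location): str(path) for location, path in table.items()}


def redirect(import_iri: str, table: Dict[str, str]) -> Optional[str]:
    """Return the local file for import_iri, or None when no mapping exists."""
    return table.get(import_iri)


# Hypothetical table content for the OBO_REL case described above.
EXAMPLE_TABLE = """
{
  "http://purl.org/obo/owl/OBO_REL": "ontologies/relationship.owl"
}
"""
```

With this table loaded, redirect("http://purl.org/obo/owl/OBO_REL", table) gives the file that holds the ontology named http://purl.org/obo/owl/relationship, and any unmapped location gives None so the tool can fall back to asking the user.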

There are a couple of cases where this technique works extremely well. Suppose that a user is online, using a tool to access ontologies, and wants to prepare for offline mode. The tool can download the needed ontologies to the user's disk. As the download occurs, the tool can record which IO locations correspond to which files written to disk. Later, when the user is offline, the tool can use its map from IO locations to file locations to redirect import declarations.
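The download-and-record step might look roughly like this. The function prepare_offline and its parameters are hypothetical. The actual network access is passed in as a fetch callable so that the sketch stays independent of any particular IO library, and the choice of hashed file names is an assumption of mine.

```python
import hashlib
import os
from typing import Callable, Dict, Iterable


def prepare_offline(
    import_iris: Iterable[str],
    fetch: Callable[[str], bytes],
    cache_dir: str,
) -> Dict[str, str]:
    """Store each imported document locally and return IO location -> file path."""
    os.makedirs(cache_dir, exist_ok=True)
    location_to_file: Dict[str, str] = {}
    for iri in import_iris:
        content = fetch(iri)  # performed while the user is still online
        # Hashing the IRI gives a file name that is valid on any file system.
        name = hashlib.sha256(iri.encode("utf-8")).hexdigest()[:16] + ".owl"
        path = os.path.join(cache_dir, name)
        with open(path, "wb") as handle:
            handle.write(content)
        location_to_file[iri] = path
    return location_to_file
```

The returned dictionary is the map that the tool would save alongside the files and consult later when resolving imports offline.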

This technique will probably also be used by developers of ontology tools that hide the ontology from the user. For example, a tool that uses an ontology to diagnose the cause of an illness will be used by doctors. These doctors may have no interest in or knowledge of the underlying ontology. One of the steps in building this tool will be the construction of the IO redirection map.

This approach has two disadvantages. First, it is tool specific. An IO redirection map written for the OWL API (or Protege 4) will not be usable by a user of Jena (or TopBraid). Hopefully users will either have tools to convert these maps or will be able to manually reconstruct the map for one tool from the map for another tool. Second, there will be scenarios where this information needs to be manually inserted by the user.

Modification of Import Statements

There may be some cases where it makes sense for a tool to change the import directives before exporting or saving an ontology. For example, an ontology repository could change the import statements in its exported ontologies to point back to the repository. The advantage of this approach is that the modified ontology has the desired import behavior. The disadvantage is that the content of the ontology is changed to obtain this behavior. In the case of the ontology repository, there could be a separate mechanism for accessing the ontology that would contain the original imports.
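For a document in functional-style syntax, such a rewrite could be sketched at the text level as follows. This is a simplification under several assumptions: rewrite_imports is a hypothetical name, only Import directives that use a full IRI in angle brackets are handled, and the text is not actually parsed, so an Import(<...>) pattern inside a string literal would be rewritten as well. A real tool would more likely change the directives through its ontology model.

```python
import re
from typing import Callable

# Matches Import(<IRI>) with optional whitespace around the IRI.
IMPORT_DIRECTIVE = re.compile(r"Import\(\s*<([^>\s]+)>\s*\)")


def rewrite_imports(document: str, new_location: Callable[[str], str]) -> str:
    """Replace the IRI of every Import(<...>) directive with new_location(iri)."""
    def replace(match):
        # match.group(1) is the original import IRI.
        return "Import(<" + new_location(match.group(1)) + ">)"
    return IMPORT_DIRECTIVE.sub(replace, document)
```

The ontology IRI in the Ontology(...) header is left untouched. Only the locations named by the import directives change, which is the content change the paragraph above warns about.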

Avoiding Duplicate Imports

In some cases, ontologies in an import closure will import the same ontology using different import statements.