ProtegeOWL API Basics

Main article: ProtegeOWL_API_Programmers_Guide



Basics

Working with OWL Models

The Protege-OWL API is centered around a collection of Java interfaces from the model package. These interfaces provide access to the OWL model and its elements like classes, properties, and individuals. Application developers should not access the implementation of these interfaces (such as DefaultRDFIndividual) directly, but only operate on the interfaces. Using these interfaces you don't have to worry about the internal details of how Protege stores ontologies. Everything is abstracted into interfaces and your code should not make any assumptions about the specific implementation.

The most important model interface is OWLModel, which provides access to the top-level container of the resources in the ontology. You can use OWLModel to create, query, and delete resources of various types and then use the objects returned by the OWLModel to do specific operations. For example, the following snippet creates a new OWLNamedClass (which corresponds to owl:Class in OWL), and then gets its URI:

    OWLModel owlModel = ProtegeOWL.createJenaOWLModel();
    OWLNamedClass worldClass = owlModel.createOWLNamedClass("World");
    System.out.println("Class URI: " + worldClass.getURI());

Note that the class ProtegeOWL provides several convenient static methods for creating OWLModels, including from existing OWL files. For example, you can load an existing ontology from the web using:

    String uri = "http://www.co-ode.org/ontologies/pizza/2007/02/12/pizza.owl";
    OWLModel owlModel = ProtegeOWL.createJenaOWLModelFromURI(uri);

Names, Namespace prefixes, and URIs

OWL and RDF resources are globally identified by their URIs, such as http://www.owl-ontologies.com/travel.owl#Destination. However, since URIs are long and often inconvenient to handle, the primary access and identification mechanism for ontological resources in the Protege-OWL API is their name. A name is a short form consisting of a local name and an optional prefix. Prefixes are typically defined in the ontology to abbreviate names of imported resources. For example, instead of writing http://www.w3.org/2002/07/owl#Class, we can write owl:Class because "owl" is a prefix for the namespace "http://www.w3.org/2002/07/owl#". Similarly, if an ontology imports the travel ontology from above, it can define the prefix "travel" to access all imported classes and properties, e.g. travel:Destination. If a resource is in the default namespace of the ontology, the prefix is empty, i.e., the name of the resource is simply Destination.

Application developers can control namespace prefixes using the NamespaceManager object. To access the current NamespaceManager of an OWLModel, use OWLModel.getNamespaceManager(). Assuming we have loaded the travel ontology with its namespace as the default namespace, we can access the resources contained in the OWLModel using the following example calls:

    OWLNamedClass destinationClass = owlModel.getOWLNamedClass("Destination");
    OWLObjectProperty hasContactProperty = owlModel.getOWLObjectProperty("hasContact");
    OWLDatatypeProperty hasZipCodeProperty = owlModel.getOWLDatatypeProperty("hasZipCode");
    OWLIndividual sydney = owlModel.getOWLIndividual("Sydney");

... and use the objects to perform further queries or operations on the corresponding resources. For example, in order to extract the URI for a named object, you can use the RDFResource.getURI() method.
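
The relationship between names, prefixes, and URIs described in this section can be illustrated with a small plain-Java sketch. It does not use the Protege-OWL API: the class PrefixedNameSketch and its methods are hypothetical names chosen for this illustration, and the real NamespaceManager may differ in detail.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only; NOT the Protege-OWL NamespaceManager.
final class PrefixedNameSketch {
    private final Map<String, String> prefixToNamespace = new HashMap<>();
    private final String defaultNamespace;

    PrefixedNameSketch(String defaultNamespace) {
        this.defaultNamespace = defaultNamespace;
    }

    void setPrefix(String prefix, String namespace) {
        prefixToNamespace.put(prefix, namespace);
    }

    // Expands "prefix:localName" or a bare "localName" into a full URI.
    String toURI(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) {
            // Empty prefix: the resource lives in the default namespace.
            return defaultNamespace + name;
        }
        String prefix = name.substring(0, colon);
        String namespace = prefixToNamespace.get(prefix);
        if (namespace == null) {
            throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
        return namespace + name.substring(colon + 1);
    }

    public static void main(String[] args) {
        PrefixedNameSketch names =
            new PrefixedNameSketch("http://www.owl-ontologies.com/travel.owl#");
        names.setPrefix("owl", "http://www.w3.org/2002/07/owl#");
        System.out.println(names.toURI("Destination"));
        System.out.println(names.toURI("owl:Class"));
    }
}
```

In this sketch, the bare name Destination resolves against the default namespace, while owl:Class resolves against the namespace registered for the "owl" prefix.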

Understanding the Model Interfaces

The interfaces of the model package are arranged in an inheritance hierarchy. An overview of the available interfaces can be found in the API diagram, contributed by Matthew Horridge. (You may want to print this diagram and leave it on your desk while you are using the API.) The base interface of all resources is RDFResource, from which subinterfaces for classes, properties, and individuals are derived:

 RDFResource
     RDFSClass
     RDFProperty
     RDFIndividual

RDFResource defines basic operations for all resources, in particular getting and setting property values. RDFProperty is the base interface for rdf:Property and its subtypes owl:DatatypeProperty and owl:ObjectProperty. For classes, the hierarchy gets quite complex because of the various types of anonymous classes in OWL. This will be handled later.

Here is some example code that creates a simple class, an individual of that class, and then assigns a couple of properties.

    OWLModel owlModel = ProtegeOWL.createJenaOWLModel();

    OWLNamedClass personClass = owlModel.createOWLNamedClass("Person");

    OWLDatatypeProperty ageProperty = owlModel.createOWLDatatypeProperty("age");
    ageProperty.setRange(owlModel.getXSDint());
    ageProperty.setDomain(personClass);

    OWLObjectProperty childrenProperty = owlModel.createOWLObjectProperty("children");
    childrenProperty.setRange(personClass);
    childrenProperty.setDomain(personClass);

    RDFIndividual darwin = personClass.createRDFIndividual("Darwin");
    darwin.setPropertyValue(ageProperty, new Integer(0));

    RDFIndividual holgi = personClass.createRDFIndividual("Holger");
    holgi.setPropertyValue(childrenProperty, darwin);
    holgi.setPropertyValue(ageProperty, new Integer(33));

Creating Named Classes and Individuals

The Protege-OWL API makes a clear distinction between named classes and anonymous classes. Named classes are used to create individuals, while anonymous classes are used to specify logical characteristics (restrictions) of named classes. We will handle anonymous classes later, but let's look at named classes now.

To reflect the OWL specification, there are two types of named classes: RDFSNamedClass (rdfs:Class) and OWLNamedClass (owl:Class). Unless you are explicitly working in RDF, you will most likely create OWL classes. After you have created the classes, you can arrange them in a subclass relationship:

    OWLNamedClass personClass = owlModel.createOWLNamedClass("Person");

    // Create subclass (complicated version)
    OWLNamedClass brotherClass = owlModel.createOWLNamedClass("Brother");
    brotherClass.addSuperclass(personClass);
    brotherClass.removeSuperclass(owlModel.getOWLThingClass());

In this example, the class Brother is created first as a top-level class. By default, the method OWLModel.createOWLNamedClass(String name) makes a new class a subclass of owl:Thing only. Then, Person is added to the superclasses, leading to a situation in which both Person and owl:Thing are parents. Therefore, owl:Thing needs to be removed afterwards.

A much more convenient way of creating a subclass of Person is as follows:

   OWLNamedClass sisterClass = owlModel.createOWLNamedSubclass("Sister", personClass);

The resulting inheritance hierarchy of the code above is:

   Person
       Brother
       Sister

A simple recursive call can then be used to print arbitrary hierarchies with indentation:

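        // Inside the enclosing method (not shown in this excerpt):
        printClassTree(personClass, "");
    }

    // Requires: import java.util.Iterator;
    private static void printClassTree(RDFSClass cls, String indentation) {
        System.out.println(indentation + cls.getName());
        // false = direct subclasses only; the recursion visits the deeper levels
        for (Iterator it = cls.getSubclasses(false).iterator(); it.hasNext();) {
            RDFSClass subclass = (RDFSClass) it.next();
            printClassTree(subclass, indentation + "    ");
        }
    }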

In the recursive routine, RDFSClass.getSubclasses() takes a boolean argument. When set to true, this will return not only the direct subclasses of the current class, but also the subclasses of the subclasses, etc.
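
The effect of the boolean argument can be illustrated independently of the API. The following plain-Java sketch is not Protege code: the class SubclassSketch, its methods, and the extra class YoungerBrother are hypothetical, and the hierarchy is assumed to be free of cycles.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; NOT the Protege-OWL RDFSClass interface.
final class SubclassSketch {
    private final Map<String, List<String>> directSubclasses = new HashMap<>();

    void addSubclass(String parent, String child) {
        directSubclasses.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // transitive == false: direct subclasses only.
    // transitive == true: subclasses of subclasses are included as well.
    Set<String> getSubclasses(String cls, boolean transitive) {
        Set<String> result = new LinkedHashSet<>();
        List<String> children =
            directSubclasses.getOrDefault(cls, Collections.<String>emptyList());
        for (String child : children) {
            result.add(child);
            if (transitive) {
                result.addAll(getSubclasses(child, true));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SubclassSketch model = new SubclassSketch();
        model.addSubclass("Person", "Brother");
        model.addSubclass("Person", "Sister");
        model.addSubclass("Brother", "YoungerBrother");
        System.out.println(model.getSubclasses("Person", false)); // [Brother, Sister]
        System.out.println(model.getSubclasses("Person", true));  // [Brother, YoungerBrother, Sister]
    }
}
```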

Named classes can be used to generate individuals. The instances of a class can then be queried using the RDFSClass.getInstances() method:

    OWLIndividual hans = brotherClass.createOWLIndividual("Hans");
    Collection brothers = brotherClass.getInstances(false);
    assert (brothers.contains(hans));
    assert (brothers.size() == 1);

There is a crucial distinction between "direct" and "indirect" instances. The individual Hans is a direct instance of Brother, but not a direct instance of Person. However, it is also an indirect instance of Person, because Brother is a subclass of Person, i.e., every Brother is also a Person. Programmers are able to select whether their calls shall address only the direct or also the indirect instances using a boolean parameter:

    assert (personClass.getInstanceCount(false) == 0);
    assert (personClass.getInstanceCount(true) == 1);
    assert (personClass.getInstances(true).contains(hans));
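
The distinction between direct and indirect instances can be sketched in plain Java as follows. This is an illustration, not the Protege-OWL implementation: the class InstanceSketch and its methods are hypothetical, and for brevity each class is assumed to have at most one superclass.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only; NOT the Protege-OWL API.
final class InstanceSketch {
    // class name -> name of its single superclass (null for a root class)
    private final Map<String, String> superclassOf = new HashMap<>();
    // individual name -> name of its direct class
    private final Map<String, String> directTypeOf = new HashMap<>();

    void addClass(String cls, String parent) {
        superclassOf.put(cls, parent);
    }

    void addIndividual(String name, String cls) {
        directTypeOf.put(name, cls);
    }

    boolean hasType(String individual, String cls, boolean includeIndirect) {
        String type = directTypeOf.get(individual);
        if (type == null) {
            return false;
        }
        if (type.equals(cls)) {
            return true; // direct instance
        }
        if (!includeIndirect) {
            return false;
        }
        // Walk up the superclass chain to find indirect types.
        for (String c = superclassOf.get(type); c != null; c = superclassOf.get(c)) {
            if (c.equals(cls)) {
                return true;
            }
        }
        return false;
    }

    int getInstanceCount(String cls, boolean includeIndirect) {
        int count = 0;
        for (String individual : directTypeOf.keySet()) {
            if (hasType(individual, cls, includeIndirect)) {
                count++;
            }
        }
        return count;
    }
}
```

With Brother registered as a subclass of Person and Hans as an individual of Brother, the sketch counts zero direct instances of Person and one instance once indirect instances are included.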

The inverse query to get the type/class of an individual can be made using the RDFResource.getRDFType() family of methods. For example, Hans has the rdf:type Brother, and it also has the (indirect) type Person:

   assert (hans.getRDFType().equals(brotherClass));
   assert (hans.hasRDFType(brotherClass));
   assert !(hans.hasRDFType(personClass, false));
   assert (hans.hasRDFType(personClass, true));

When the life cycle of a resource is over, it can be deleted using the RDFResource.delete() method. The API uses the same convention as other APIs, where "delete" means to completely destroy the object, whereas "remove" only deletes references to the object. This means that when you call delete on a resource, all of its property values are removed from it, but the values themselves are not deleted.

   hans.delete();

Using Datatype Properties and Datatype Values

To create an owl:DatatypeProperty, you can use OWLModel.createOWLDatatypeProperty(String name). By default, datatype properties can take any datatype value such as strings and integers. OWL defines several XML Schema datatypes that can be used to restrict the range of properties. The most popular XML Schema datatypes are xsd:string, xsd:int, xsd:float, and xsd:boolean. For example, if you want to limit your property to only take string values, you can restrict its range using:

    OWLDatatypeProperty nameProperty = owlModel.createOWLDatatypeProperty("name");
    nameProperty.setRange(owlModel.getXSDstring());

... where the call OWLModel.getXSDstring() returns a reference to the RDFSDatatype xsd:string. Other default datatypes are accessible using similar OWLModel.getXSD... methods. More complex datatypes can be accessed by their prefixed name:

   RDFSDatatype dateType = owlModel.getRDFSDatatypeByName("xsd:date");

For the default datatypes, property values are conveniently handled using corresponding Java data types. For example, if you assign property values for a string property, you can simply pass a String object to the setPropertyValue call. Corresponding mappings exist into other data types:

    individual.setPropertyValue(stringProperty, "MyString");
    individual.setPropertyValue(intProperty, new Integer(42));
    individual.setPropertyValue(floatProperty, new Float(4.2));
    individual.setPropertyValue(booleanProperty, Boolean.TRUE);

The inverse getter methods will also deliver the objects in their simplest possible forms:

    String stringValue = (String) individual.getPropertyValue(stringProperty);
    Integer intValue = (Integer) individual.getPropertyValue(intProperty);
    Float floatValue = (Float) individual.getPropertyValue(floatProperty);
    Boolean booleanValue = (Boolean) individual.getPropertyValue(booleanProperty);

Values of all other data types are wrapped into objects of the class RDFSLiteral. A literal combines a value together with its datatype. Values are stored as strings and need to be unwrapped by the user code. The following example is used to assign a value of the XML Schema datatype xsd:date.

    RDFSDatatype xsdDate = owlModel.getRDFSDatatypeByName("xsd:date");
    OWLDatatypeProperty dateProperty = owlModel.createOWLDatatypeProperty("dateProperty", xsdDate);
    RDFSLiteral dateLiteral = owlModel.createRDFSLiteral("1971-07-06", xsdDate);
    individual.setPropertyValue(dateProperty, dateLiteral);
    RDFSLiteral myDate = (RDFSLiteral) individual.getPropertyValue(dateProperty);
    System.out.println("Date: " + myDate);

... will print out "Date: 1971-07-06".

RDFSLiterals are also used to bundle string values together with a language tag:

    RDFSLiteral langLiteral = owlModel.createRDFSLiteral("Wert", "de");
    individual.setPropertyValue(stringProperty, langLiteral);
    RDFSLiteral result = (RDFSLiteral) individual.getPropertyValue(stringProperty);
    assert (result.getLanguage().equals("de"));
    assert (result.getString().equals("Wert"));

To summarize, datatype values are handled either as primitive objects (String, Integer, Float, Boolean) or as RDFSLiterals. A literal of a default datatype is automatically simplified into the corresponding primitive object. In some cases, it may be more convenient to always work with RDFSLiterals, in particular if user code has to operate on arbitrary datatypes. For these cases, the Protégé-OWL API provides a number of convenience methods that are guaranteed to return RDFSLiterals, e.g., OWLModel.asRDFSLiteral().
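
The simplification rule summarized above can be sketched in plain Java. This is only an illustration of the behavior described in this section, not how the Protege-OWL API implements it; the class LiteralSketch and its simplify() method are hypothetical names.

```java
// Plain-Java illustration only; NOT the Protege-OWL RDFSLiteral interface.
final class LiteralSketch {
    final String lexicalValue; // values are stored as strings
    final String datatype;     // prefixed datatype name, e.g. "xsd:int"
    final String language;     // language tag, or null if there is none

    LiteralSketch(String lexicalValue, String datatype, String language) {
        this.lexicalValue = lexicalValue;
        this.datatype = datatype;
        this.language = language;
    }

    // Returns a plain Java object for the four default datatypes;
    // everything else (and anything with a language tag) stays wrapped.
    Object simplify() {
        if (language != null) {
            return this;
        }
        switch (datatype) {
            case "xsd:string":
                return lexicalValue;
            case "xsd:int":
                return Integer.valueOf(lexicalValue);
            case "xsd:float":
                return Float.valueOf(lexicalValue);
            case "xsd:boolean":
                return Boolean.valueOf(lexicalValue);
            default:
                return this;
        }
    }

    public static void main(String[] args) {
        Object simple = new LiteralSketch("42", "xsd:int", null).simplify();
        Object wrapped = new LiteralSketch("1971-07-06", "xsd:date", null).simplify();
        System.out.println(simple instanceof Integer);        // true
        System.out.println(wrapped instanceof LiteralSketch); // true
    }
}
```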

Using Object Properties to Build Relationships between Resources

Object properties (owl:ObjectProperty) are used to establish relations between individuals. The following snippet creates a new object property "children" that can take Persons as values:

    OWLNamedClass personClass = owlModel.createOWLNamedClass("Person");
    OWLObjectProperty childrenProperty = owlModel.createOWLObjectProperty("children");
    childrenProperty.setRange(personClass);

Then, the API can be used to assign property values for individuals:

    RDFIndividual darwin = personClass.createRDFIndividual("Darwin");
    RDFIndividual holgi = personClass.createRDFIndividual("Holger");
    holgi.setPropertyValue(childrenProperty, darwin);

The API also has various methods to add or remove values:

    // "other" stands for any further individual created elsewhere
    holgi.addPropertyValue(childrenProperty, other);
    holgi.removePropertyValue(childrenProperty, other);

Let us now assume there is also a class Animal, and you want to specify that the range of the children property is either Animals or Persons. In OWL you need to create an owl:unionOf class to declare such ranges, because declaring both classes as rdfs:ranges would mean that only objects that are at the same time Persons and Animals are valid for the property. Therefore, the correct OWL declaration should look like the following:

    <owl:Class rdf:ID="Person"/>
    <owl:Class rdf:ID="Animal"/>
    <owl:ObjectProperty rdf:ID="children">
      <rdfs:range>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="#Person"/>
            <owl:Class rdf:about="#Animal"/>
          </owl:unionOf>
        </owl:Class>
      </rdfs:range>
    </owl:ObjectProperty>

While it is rather complicated to deal with unions manually, the Protege-OWL API makes it very simple to create union ranges on the fly:

    OWLNamedClass personClass = owlModel.createOWLNamedClass("Person");
    OWLNamedClass animalClass = owlModel.createOWLNamedClass("Animal");
    OWLObjectProperty childrenProperty = owlModel.createOWLObjectProperty("children");
    childrenProperty.addUnionRangeClass(personClass);
    childrenProperty.addUnionRangeClass(animalClass);
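
The difference between several separate range statements and a single union range can be made concrete with a small plain-Java sketch. It treats a range as a simple membership check, which simplifies OWL semantics, and it does not use the Protege-OWL API; the class RangeSemanticsSketch and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; NOT the Protege-OWL API.
final class RangeSemanticsSketch {

    // Several separate rdfs:range statements:
    // the value must belong to ALL of the listed classes.
    static boolean allowedByMultipleRanges(Set<String> typesOfValue, List<String> ranges) {
        return typesOfValue.containsAll(ranges);
    }

    // One range that is an owl:unionOf class:
    // the value must belong to AT LEAST ONE of the operands.
    static boolean allowedByUnionRange(Set<String> typesOfValue, List<String> operands) {
        for (String operand : operands) {
            if (typesOfValue.contains(operand)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> typesOfDarwin = Collections.singleton("Person");
        List<String> classes = Arrays.asList("Person", "Animal");
        System.out.println(allowedByMultipleRanges(typesOfDarwin, classes)); // false
        System.out.println(allowedByUnionRange(typesOfDarwin, classes));     // true
    }
}
```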

Object properties can also be declared to have other characteristics, e.g., they can be transitive. In the following code snippet, the property ancestor is declared to be transitive because if A is an ancestor of B, and B is an ancestor of C, then A is also an ancestor of C.

    OWLObjectProperty ancestorProperty = owlModel.createOWLObjectProperty("ancestor");
    ancestorProperty.setRange(personClass);
    ancestorProperty.setTransitive(true);

Similar methods exist for making a property symmetric or functional.
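
What transitivity means for the ancestor example can be sketched in plain Java. Declaring a property transitive records the characteristic in the ontology; the sketch below only illustrates the conclusions that follow from it, and it is not part of the Protege-OWL API. The class TransitiveSketch and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; NOT the Protege-OWL API.
final class TransitiveSketch {
    // subject -> objects that are directly asserted for the property
    private final Map<String, Set<String>> asserted = new HashMap<>();

    void add(String subject, String object) {
        asserted.computeIfAbsent(subject, k -> new LinkedHashSet<>()).add(object);
    }

    // All values reachable by following asserted statements repeatedly.
    Set<String> inferredValues(String subject) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>(
            asserted.getOrDefault(subject, Collections.<String>emptySet()));
        while (!todo.isEmpty()) {
            String next = todo.pop();
            if (seen.add(next)) { // the seen set also guards against cycles
                todo.addAll(asserted.getOrDefault(next, Collections.<String>emptySet()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        TransitiveSketch ancestorOf = new TransitiveSketch();
        ancestorOf.add("A", "B"); // A is an ancestor of B
        ancestorOf.add("B", "C"); // B is an ancestor of C
        System.out.println(ancestorOf.inferredValues("A")); // [B, C]
    }
}
```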

Working with References to External/Untyped Resources

In many cases, resources in an ontology have references to other Web resources that are not specified as OWL resources. For example, you may want to define a link from an individual to an image. OWL/RDF documents can contain arbitrary links into other URIs, such as the hasImage property value below.

    <rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:owl="http://www.w3.org/2002/07/owl#"
      xmlns="http://www.owl-ontologies.com/unnamed.owl#"
      xml:base="http://www.owl-ontologies.com/unnamed.owl">
      <owl:Ontology rdf:about=""/>
      <owl:Class rdf:ID="Person"/>
      <rdf:Property rdf:ID="hasImage"/>
      <Person rdf:ID="Darwin">
        <hasImage rdf:resource="http://www.knublauch.com/darwin/Darwin-Feeding-Smiling.jpg"/>
      </Person>
    </rdf:RDF>

The Protege-OWL API supports such links by means of the RDFUntypedResource class. Untyped resources are individuals that have no rdf:type statement. Since the concept of namespace prefixes does not consistently apply to untyped resources, instances of the RDFUntypedResource class have the full URI as their name. The following code snippet creates the example ontology shown above:

    JenaOWLModel owlModel = ProtegeOWL.createJenaOWLModel();

    OWLNamedClass personClass = owlModel.createOWLNamedClass("Person");
    OWLIndividual individual = personClass.createOWLIndividual("Darwin");
    RDFProperty hasImageProperty = owlModel.createRDFProperty("hasImage");

    String uri = "http://www.knublauch.com/darwin/Darwin-Feeding-Smiling.jpg";
    RDFUntypedResource image = owlModel.createRDFUntypedResource(uri);
    individual.addPropertyValue(hasImageProperty, image);

    Jena.dumpRDF(owlModel.getOntModel());

Property Domains

The domain of a property specifies the types of resources that can take values for the property. In the most trivial case, a property is domainless, i.e., it does not have any rdfs:domain statement in the OWL file. This is logically equivalent to having a domain consisting of only the class owl:Thing, because every class is derived from owl:Thing. The default domain of a new property in Protégé is null, meaning that it is domainless.

The API provides several methods in the class RDFProperty to set and query the domain. The following code snippet creates a new property and puts the class Person into its domain:

    OWLNamedClass personClass = owlModel.createOWLNamedClass("Person");
    OWLObjectProperty childrenProperty = owlModel.createOWLObjectProperty("children");
    childrenProperty.setDomain(personClass);

In principle, properties can have multiple domain definitions. Similar to range statements, having multiple domains means that the property can be applied to the intersection of the domain classes. This seldom reflects the modeler's intention, and the much more common case is to declare the domain to be a union of various classes, implemented by means of an anonymous owl:unionOf class. In the Protege-OWL API, these union classes can be created automatically using the following call:

    OWLNamedClass animalClass = owlModel.createOWLNamedClass("Animal");
    childrenProperty.addUnionDomainClass(animalClass);

This results in the following OWL code:

    <owl:Class rdf:ID="Person"/>
    <owl:Class rdf:ID="Animal"/>
    <owl:ObjectProperty rdf:ID="children">
      <rdfs:domain>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="#Person"/>
            <owl:Class rdf:about="#Animal"/>
          </owl:unionOf>
        </owl:Class>
      </rdfs:domain>
    </owl:ObjectProperty>

The handling of domains is a bit more complicated if you have subproperty hierarchies. If the subproperty does not have its own domain, then it inherits the domain of its superproperties. For example, if you have a property sons, which is a subproperty of children, and you leave the domain of sons unspecified, then it consists of Person and Animal as well:

    OWLObjectProperty sonsProperty = owlModel.createOWLObjectProperty("sons");
    sonsProperty.addSuperproperty(childrenProperty);
    assert (sonsProperty.getDomain(false) == null);
    assert (sonsProperty.getDomain(true) instanceof OWLUnionClass);

Here, the RDFProperty.getDomain() method takes a boolean flag to distinguish between the direct domain of the property and its possibly inherited domain. In the example above, the domain consists of an OWLUnionClass (which will be handled in detail below), to indicate that either Person or Animal are valid domains. Since this is a common pattern, the following convenience method RDFProperty.getUnionDomain() resolves union domains into a handy collection of classes:

    Collection unionDomain = sonsProperty.getUnionDomain(true);
    assert (unionDomain.contains(personClass));
    assert (unionDomain.contains(animalClass));
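
The lookup of direct versus inherited domains can be sketched in plain Java. This is an illustration under simplifying assumptions, not the Protege-OWL implementation: the class DomainSketch and its methods are hypothetical, an empty set stands for "domainless", and when there are several superproperties the sketch simply takes the first one that has a domain.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; NOT the Protege-OWL RDFProperty interface.
final class DomainSketch {
    private final List<DomainSketch> superproperties = new ArrayList<>();
    private final Set<String> directDomain = new LinkedHashSet<>();

    void addUnionDomainClass(String cls) {
        directDomain.add(cls);
    }

    void addSuperproperty(DomainSketch superproperty) {
        superproperties.add(superproperty);
    }

    // includingInherited == false: only the property's own domain.
    // includingInherited == true: fall back to a superproperty's domain
    // when the property has no domain of its own.
    Set<String> getUnionDomain(boolean includingInherited) {
        if (!directDomain.isEmpty() || !includingInherited) {
            return Collections.unmodifiableSet(directDomain);
        }
        for (DomainSketch superproperty : superproperties) {
            Set<String> inherited = superproperty.getUnionDomain(true);
            if (!inherited.isEmpty()) {
                return inherited;
            }
        }
        return Collections.emptySet();
    }

    public static void main(String[] args) {
        DomainSketch children = new DomainSketch();
        children.addUnionDomainClass("Person");
        children.addUnionDomainClass("Animal");
        DomainSketch sons = new DomainSketch();
        sons.addSuperproperty(children);
        System.out.println(sons.getUnionDomain(false)); // []
        System.out.println(sons.getUnionDomain(true));  // [Person, Animal]
    }
}
```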