Protege4ClientServer

From Protege Wiki

Introduction

These are some pages under development to document the Protege 4 Server. We are hoping to release an early alpha soon.

This server is in an early alpha stage, and it is recommended that you back up critical files often.

What is it?

The Protege OWL Server provides a platform for collaborative editing and version control of a collection of ontologies. The Protege server tracks changes made to its ontologies, enforces an access control policy for its documents and checks for conflicts between its clients. When used with the Protege client, ontology editors can view and modify a shared ontology in parallel. If an editor chooses, the editor can watch changes made by other editors as they occur. To change an ontology, an editor first makes the changes in his local copy of the ontology. When he is happy with his changes, he can commit them, making them available to other editors of the ontology. Alternatively, an editor making changes to his local copy can save his copy of the changes and commit them in a later session.

In addition, the Protege OWL Server can be used as something more like a simple version control system. We are developing a set of command line tools that will be able to use a Protege OWL Server to provide such traditional version control services as checkin, checkout, update, commit and history query commands. The Protege 4 client can be used in this manner as well: an ontology editor can choose not to turn on auto-update and make all his updates and commits manually.

Comparison with the Protege 3 Server

There are several differences between the Protege 3 Server and the Protege 4 Server:

The local copy.
In Protege 3, when a client connects to the server, any change made in the client is immediately reflected on the server. In Protege 4, in contrast, changes only get propagated to the Protege server when the user commits the change. This allows a user of a Protege client to consider his changes before sending the changes to the server. This is a significant enough concept that we describe it in more detail below.
Decoupled client-server.
In Protege 3 when the server goes down or the network is interrupted, the Protege 3 client either freezes or crashes. In contrast, in Protege 4, if the server stops or is inaccessible, the Protege client continues running normally. It is only when some server operation is attempted, such as an update or commit, that the user may become aware that there is a problem communicating with the server.
Commit granularity.
In Protege 3, changes are sent to the server as they are made. In Protege 4, a collection of changes is only committed when the user is ready, at which point the user is able to add a commit comment describing the nature of the changes.
Optional automatic update.
In Protege 3, a user sees edits from other users as they occur. In Protege 4, this is optional. This will allow, for instance, a user to start a reasoner and query the ontology state without worrying that the ontology will change as the query is in progress.

The Local Copy/Sandbox

With the Protege 4 client server, when a user checks an ontology out from the server, he gets a separate copy of the server ontology. The user can then modify this copy in any way that he likes and the changes will not go to the server until the user commits the changes.

In fact this local copy can be saved to disk and then even edited with an editor other than Protege before it is committed to the server. Specifically, a user can

  1. start Protege and load an ontology from the server,
  2. save the ontology somewhere on the local disk,
  3. exit Protege and edit the ontology with a text editor,
  4. restart Protege and open the ontology from disk, and
  5. commit the changes, which will include the changes made with the text editor.

What happens is that when the file is saved, Protege also saves some files recording the server that provides the ontology document, the location of the document on the server, and the revision of the ontology document on the server. So if I save an ontology as Thesaurus-redmond.owl in the client.ontologies directory, then Protege saves the following files:

  - client.ontologies
       o Thesaurus-redmond.owl
       - .owlserver
            o Thesaurus-redmond.owl.history
            o Thesaurus-redmond.owl.vontology

The Thesaurus-redmond.owl.vontology file contains information that describes the relationship between the ontology in Thesaurus-redmond.owl and the document on the server. The Thesaurus-redmond.owl.history file contains a local cache of the history of changes made to the ontology document on the server. It is not required (if it is deleted it will be rebuilt), but it provides significant performance advantages for the client, especially when the network is slow or the ontology is large.
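
As a sketch of the point above (using the file names from the example), the history file is only a cache and can be deleted safely, while the vontology file must be preserved or the association with the server document is lost:

```shell
# Recreate the saved-ontology layout from the example above.
mkdir -p client.ontologies/.owlserver
touch client.ontologies/Thesaurus-redmond.owl
touch client.ontologies/.owlserver/Thesaurus-redmond.owl.vontology
touch client.ontologies/.owlserver/Thesaurus-redmond.owl.history

# The .history file is only a cache; it is safe to delete and the
# client will rebuild it from the server on the next connection.
rm client.ontologies/.owlserver/Thesaurus-redmond.owl.history
```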

Videos

Here are some videos that I am making to demonstrate server features:

  • Protege OWL Client-Server Basics. This video shows how to
    • access a server,
    • upload an ontology,
    • follow changes made by another user with auto-update,
    • have an extended session with the server spanning multiple Protege sessions.
  • Accessing the Protege OWL Server from the command line. This video shows how to use the command line client to
    • browse the Protege OWL server directories and ontologies with the pos-list command.
    • upload ontologies to the Protege OWL server (pos-upload).
    • checkout ontologies from the Protege OWL server (pos-checkout).
    • commit changes back to the Protege OWL server (pos-commit).
    • support the use of ontology editing tools other than Protege to edit a shared ontology from the Protege OWL server.
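
A typical command-line session might look something like the sketch below. The pos-* command names come from the video list above, but the argument forms shown here are assumptions, and the whole block is guarded so it is a no-op on a machine without the tools:

```shell
# Hypothetical command-line session against a Protege OWL Server.
# The pos-* names appear above; these argument forms are assumptions.
SERVER=localhost                        # assumed host running the server
if command -v pos-list >/dev/null 2>&1; then
    pos-list "$SERVER"                  # browse the server's ontologies
    pos-checkout "$SERVER" pizza.owl    # fetch a local working copy
    # ...edit pizza.owl with Protege or any other editor...
    pos-commit pizza.owl                # commit the accumulated changes
fi
```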

Large ontologies on a slow network

When a large ontology is being uploaded to or downloaded from a server on a slow network, it can take a while to transfer all the necessary data. Unfortunately we have not yet determined how best to monitor and report the progress of this operation, so the user doing the upload/download will have little indication of the progress. The good news here is that the user only needs to experience this once for the initial upload of the ontology to the server and once for his initial download of the ontology. In addition, since the upload of an ontology is a one-time operation, it is very likely that it can be performed on a faster network.

The issue concerns the change history stored on the server representing the set of changes between revision 0 and revision 1. These changes consist of the full set of changes needed to create the initial version of the ontology. Thus, for instance, if the NCI Thesaurus is uploaded onto the server, the set of changes to go from revision 0 (the empty ontology) to revision 1 (the initial version of the Thesaurus on the server) will contain over 1.2 million individual changes. This change set is stored in a 300 MB file which then needs to be transferred to any client that wants a copy of the ontology. (In point of fact, this change set gets compressed before it hits the network, so the actual data copied across the wire is only about 44 MB.)

Once the ontology is downloaded to the client, the client can save the ontology with the change history to disk for later reference. When the ontology is reloaded from the disk, the client will already have a copy of the 1.2 million changes from revision 0 to revision 1 and will not need to download it again from the server.

Installation details

Notes

  • We don't yet support the deployment of the server as a service on Windows machines.
  • We assume that Windows computers are installed on the C-drive.
  • On Unix systems we support chkconfig and update-rc.d. Otherwise it is possible that the deploy script will put a working version of the protege start/stop script in the /etc/init.d directory but not complete the deployment.

Prerequisites

For the client installation, the only prerequisite is that you have successfully installed Protege. For the server installation you must have installed a copy of Oracle's Java (1.6 or 1.7). Note that Windows systems come with their own version of Java, but we have had trouble with this version in some situations. The OpenJDK Java on Linux does work, however. Java 1.7 produces prettier logs.

It is also recommended that you create a user account to run the server. Ideally this user account will have minimal access to the system. It will be used to run the Protege OWL Server. On a Linux machine this account can be created with a command like the following:

    adduser --system --home /usr/local/protege.server protege

Client Installation

When the Protege OWL server is released, the latest Protege distribution will include the latest version of the plugins needed to access the Protege OWL Server. In the meantime, to allow Protege to access the server you need to download the following files and copy them to the Protege plugins directory:

  1. the server library,
  2. the Protege client plugin and
  3. the latest OWL API (version 3.2.4).
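
A minimal sketch of this manual install is given below. The jar file names are placeholders (substitute the files you actually downloaded), and the Protege install directory is an assumption:

```shell
# Sketch of the manual client install; jar names are placeholders.
PROTEGE_HOME="${PROTEGE_HOME:-$HOME/Protege_4.1}"   # assumed install dir
mkdir -p "$PROTEGE_HOME/plugins"
for jar in server-library.jar client-plugin.jar owlapi-3.2.4.jar; do
    # Copy each downloaded jar into the plugins directory if present.
    if [ -f "$jar" ]; then
        cp "$jar" "$PROTEGE_HOME/plugins/"
    fi
done
```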

Server Installation

This page describes how to install the Protege OWL server. First download the self-extracting installer. To run the installer, you must run the command

      java -jar owl-server-installer.jar 

with administrator privileges. This is done in a slightly different way for Windows than for the Unix based operating systems (OS X and Linux).

In Windows, you create a command prompt with administrative privileges with the following steps:

  1. left-click on the Start button and click on Accessories,
  2. right-click on the "Command Prompt" icon and click on "run as administrator",
  3. when Windows asks if you want the "Windows Command Processor" to make changes to your computer, answer yes.

You will know when you have done this right because the title bar of the Command Prompt window will say "Administrator: Command Prompt".

OWLServerInstall-AdminsitrativeCommandLine.png

Linux users will probably know what to do. If their system uses sudo, the command can be entered as follows:

    sudo java -jar owl-server-installer.jar

The sudo command will work for Mac users also. Alternatively, if the root account has a password associated with it, Unix users can obtain a root prompt with the su command. When you run the installer, you should see a screen such as the following.

OWLServerInstall-Installer.png

There are five fields that need to be filled in (they are not required for the uninstall):

Sandbox User
This is a user that ideally does not have any privileges and does not have access to any sensitive files. The Protege OWL server will run under this user id. If the server were to come under a successful attack, the attacker would gain access only to this account and would still not gain meaningful access to the system. This is a standard technique for sandboxing a server and is used by some well known servers (e.g. Tomcat). Windows users with domains enabled should remember to specify the domain in this user id (e.g. at Stanford my user id would be win\tredmond, meaning the user tredmond in the domain win).
Hostname
This is the name of your machine in a form that can be accessed by clients. This is a critical field because it needs to be properly resolved by clients. Using an IP address is safe as long as it does not change; the possible IP addresses are calculated by the installer.
Java Command
The Java command supplied is often reasonable; it is taken from the Java that is being used to run the installer. Windows users will want to use the Oracle Java rather than the native Windows Java.
Memory in megabytes
The number of megabytes of memory that you want to give to the server.
Automatically start server
It is recommended that for non-Windows systems this should be selected. This will also ensure that the server is restarted on a reboot. For Windows systems, at the moment, the server has to be started manually. It is easy to uninstall it later if you decide that you don't want the server on your system.

Once these fields are filled in, the server can be installed. Note again that the server can be uninstalled without filling in any of these fields.
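
Because the Hostname field must resolve from client machines, it is worth a quick sanity check from a client before installing. The host name below is a placeholder, and `ping -c` is the Unix form (Windows uses `ping -n`):

```shell
# Check, from a client machine, that the chosen hostname is reachable.
HOST=my.server.example                  # placeholder hostname
if ping -c 1 "$HOST" >/dev/null 2>&1; then
    echo "$HOST is reachable"
else
    echo "clients may not be able to reach $HOST"
fi
```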

A detailed description of how the installed protege server is configured on your system can be found here.

Post-Installation

An important configuration file for the running server is the UsersAndGroups file which is located in the "configuration" directory under the data directory of the Protege OWL Server distribution (/var/protege.data/configuration on OS X and Linux and "C:\ProgramData\Protege OWL Server\configuration" on Windows). This file defines the users and their passwords and will probably need to be modified for real server operations. The server only reads this file on restart so it needs to be restarted if this file is changed.
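
A hedged sketch of the edit-and-restart cycle described above is shown below, using the Linux/OS X paths; the init script name is an assumption about how the deploy script installed the service, and the block is a no-op on systems without it:

```shell
# The server reads UsersAndGroups only at startup, so restart it
# after editing the file.  The init script name is an assumption.
CONF=/var/protege.data/configuration/UsersAndGroups
# ...edit $CONF with your preferred editor, then:
if [ -x /etc/init.d/protege ]; then
    sudo /etc/init.d/protege stop
    sudo /etc/init.d/protege start
fi
```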

For Unix users, once the server is deployed, it is probably useful to look in the logs directory. At the end of the install the installer reports the location of the logs directory. For Linux and OS X installs the logs are located in "/var/log/protege" and for Windows the logs are located in "C:\ProgramData\Protege OWL Server\logs". The logs from a successful server start should look something like this:

Sun Jan 13 16:25:27 PST 2013-INFO: Server configuration started. 
Sun Jan 13 16:25:27 PST 2013-INFO:     User id: protege 
Sun Jan 13 16:25:27 PST 2013-INFO:     Java: JVM 1.7.0_09-b30 Memory: 745M 
Sun Jan 13 16:25:27 PST 2013-INFO:     Language: en, Country: US 
Sun Jan 13 16:25:27 PST 2013-INFO:     Framework: Apache Software Foundation (1.5) 
Sun Jan 13 16:25:27 PST 2013-INFO:     OS: linux (3.5.0-21-generic) 
Sun Jan 13 16:25:27 PST 2013-INFO:     Processor: x86-64 
Sun Jan 13 16:25:27 PST 2013-INFO: Server configuration found 
Sun Jan 13 16:25:27 PST 2013-INFO: New server component factory: Basic Conflict Manager Factory 
Sun Jan 13 16:25:27 PST 2013-INFO: New server component factory: Plugin Infrastructure Management 
Sun Jan 13 16:25:27 PST 2013-INFO: New server component factory: Policy Components Factory 
Sun Jan 13 16:25:27 PST 2013-INFO: New server component factory: RMI Transport Factory 
Sun Jan 13 16:25:27 PST 2013-INFO: New server component factory: Core Server Factory 
Sun Jan 13 16:25:27 PST 2013-INFO: Server advertised via rmi on port 4875 
Sun Jan 13 16:25:27 PST 2013-INFO: Server exported via rmi on port 4875 
Sun Jan 13 16:25:27 PST 2013-INFO: Authentication service started 
Sun Jan 13 16:25:27 PST 2013-INFO: Basic Conflict Management started. 
Sun Jan 13 16:25:27 PST 2013-INFO: Server started 

(The Java 6 version produces some extra redundant lines. One of the advantages of Java 7 is that it pays attention to the lines that configure the logger in the logging.properties file.) If there are exceptions this early then it is possible that something went wrong, and this may need to be debugged on the p4 mailing lists.

Generally speaking, if these messages are found in the logs and the times on the log messages are about right, then this means that the server has started correctly. It is still possible to have problems, though, because, for example, a firewall may be interfering with the server or it may be using the wrong host address.

Windows users can start the server by going to the "C:\Program Files\Protege OWL Server\bin" directory and double-clicking on the run_protege_server.bat file. After starting the server, there should be an empty console window without any messages. The log messages can be found in the log directory as described above. We don't yet support running the server as a Windows service; running the server as a daemon is currently supported only on the Linux and OS X platforms.

Test the installation

The next thing to do is to test the installed server. The ultimate test is to connect to the server from a client such as the Protege client. To do this first make sure that the server libraries are installed on your copy of Protege as described here and use this client to connect to the server as described here. If you can connect from a client on the same machine as the server then the server is running correctly. If this works then you should try connecting from any other machines that you think should be able to connect to rule out possible networking problems such as firewall issues.

Troubleshooting

This section describes what to do if a Protege client cannot connect to the server.

General Checks

First check the server logs as described in the post-installation section. If the server logs are as described there and the server shows a "Server started" message, then it generally means that the server is running correctly.

Second, we can check if the server is actually running. On Unix machines (e.g. OS X and Linux) the ps command can do this:

Neptune:org.protege.owl.server.deploy% ps auxww | grep java
protege  12511  0.1  1.0 3266788 81928 pts/5   Sl   09:20   0:04 /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java -Djava.awt.headless=true -Xmx800M -server -Djava.rmi.server.hostname=171.65.32.130 -DentityExpansionLimit=1000000 -Dfile.encoding=UTF-8 -Dorg.protege.owl.server.configuration=metaproject.owl -Djava.util.logging.config.file=logging.properties -classpath lib/felix.jar:lib/ProtegeLauncher.jar org.protege.osgi.framework.Launcher
redmond  14030  0.0  0.0  13616   928 pts/5    S+   10:09   0:00 grep --color=auto java
Neptune:org.protege.owl.server.deploy% 

The process that we are looking for is a java process being run by the sandbox user. In addition, one of the things that I look for to be sure that it is the server is the reference to the metaproject.owl file hidden in the command. On Windows systems, the Task Manager presents a limited subset of this data. In this case look for a java process that is being run by the sandbox user.

We can also check to see if the server is responding to requests over the network. Telnet is a useful command for this. Most Unices still have this command by default (it is a good diagnostic tool but is now rarely used for its original purpose). I think that this tool is also available on Windows. The following checks if the server is receiving requests on localhost:

Neptune:org.protege.owl.server.deploy% telnet localhost 4875
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
^]

telnet> quit
Connection closed.
Neptune:org.protege.owl.server.deploy% 

A similar check can be used to determine if the server is responding to requests using its hostname.

Connection to server times out

In this case the server is running, but when the client tries to connect, the connection fails. The error in the Protege error box looks as follows:

Error 1 Logged at Tue Jan 15 09:23:39 PST 2013
AuthenticationFailedException: Internal failure processing authentication credentials
    org.protege.owl.server.connect.rmi.AbstractRMIClientFactory.connectToServer(AbstractRMIClientFactory.java:108)
    org.protege.owl.server.util.ClientRegistry.connectToServer(ClientRegistry.java:98)

Unfortunately the root cause is not showing. Looking at the Protege logs (${user.home}/.Protege/logs) reveals more information:

Caused by: java.rmi.ConnectException: Connection refused to host: 171.65.32.130; nested exception is: 
	java.net.ConnectException: Connection timed out
	at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
	at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)

The problem here is that, for debugging purposes, I used an IP address (171.65.32.130) as the hostname for the server. A telnet check shows that this address is not working (because, on a laptop, this network information changed):

Neptune:org.protege.owl.server.deploy% telnet 171.65.32.130 4875
Trying 171.65.32.130...
telnet: Unable to connect to remote host: Connection timed out
Neptune:org.protege.owl.server.deploy% 

While this was a fairly silly reason for the hostname to be wrong, there are other issues that make this likely to come up in practice. For example, perhaps the hostname is not fully qualified, or the hostname is a name that works on the host where the server lives but not on other machines on the network. The hostname needs to be carefully chosen so that it will be understood by all the clients that use the server.

Protege OWL Server doesn't start on reboot (Linux)

On Unix systems, if the Protege OWL server is not properly shut down, there may be a /var/log/protege/PID file that remains behind and prevents the server from properly starting up on the next boot. If this file is removed, the server will start on boot again as expected. There may be an issue with this occurring after the first reboot on Fedora installations, but this is still under investigation.
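
The cleanup is a one-liner; the sketch below guards it so that it is a no-op when the file is absent (the PID file path is the one given above):

```shell
# Remove a stale PID file left by an unclean shutdown so the server
# can start again on the next boot.
PIDFILE=/var/log/protege/PID
if [ -f "$PIDFILE" ]; then
    sudo rm "$PIDFILE"
fi
```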

Accessing the server from the Protege client

To access the server from the Protege client (either the client that is released when the server is released or a client that is patched up as described in configuring the client), click the File menu and then click Open from Protege OWL Server. As a result of this you will get a window such as the following which will allow you to set the host name, the server port (by default it is always 4875), the user name and the password.

OWLServerInstall-OpenServerFileFromProtege.png

After you fill in these fields and click on Connect to server, you will have the opportunity to open a file on the server or upload a file to the server.

Saving a local copy on disk

One of the features of the Protege OWL Server is that a user does not need to commit all her changes to the Protege OWL Server in a single Protege (or other ontology editor) session. For example, suppose that a user has some changes that she does not want to commit yet but wants to resume her session the next day. This user can save her file to the hard drive and shut Protege and her system down for the night. When she opens the ontology from the hard drive the next day, she will find that her session has saved all the work from the day before. She still has the same uncommitted changes and Protege will remember the association of the ontology with the server document.

An additional advantage of this capability is that a user can use ontology editing tools that are unaware of the Protege OWL Server. In this use case, the user can use Protege (or the command line utilities) only to update and commit changes to the ontology. I have done several experiments, for example, where I have made changes to an ontology using the emacs text editor. After doing the edits with the text editor, Protege is able to correctly determine what the uncommitted changes are and to commit those changes.

This works because when Protege saves her ontology document it also saves some additional information that indicates the server document associated with the ontology and the current revision of the saved ontology document. Thus after a save, the directory might look something like the following:

Neptune:client% find . -type f
./catalog-v001.xml                           <-- Protege makes this file and that is a different story
./pizza-redmond.owl                          <-- The saved ontology
./.owlserver/pizza-redmond.owl.vontology     <-- A file describing the server, the server document and the client revision
./.owlserver/pizza-redmond.owl.history       <-- A cache of the server history of changes for the ontology
Neptune:client% 

In fact, I believe that working off these saved files will become the preferred method of working with server documents. Using this approach, a user can, if she so desires, start work on an ontology document without even connecting to the server. The user does not need to go through the process of browsing through the server files to open an ontology and the user gets to choose what ontology editing tools to use. Finally the user does not have to connect to the server until that connection is needed.
