Protege Client Server Tutorial Advanced

From Protege Wiki

Protege Server - Advanced Topics

This page describes advanced topics related to the Protege server including:

  • Running the Protege server as a Windows Service
  • Working with Firewalls
  • Configuration Settings in the start-up scripts of the server and client
  • Debug and Performance Monitoring
  • Accessing the Server Programmatically

This page is part of the Protege client-server tutorial.




How Does RMI Work?

We recommend that you take a look at the How RMI Works wiki page. This will help you better understand and debug RMI related problems.


Running as a Windows Service

In some cases it may be desirable to have the Protege server start as a Windows service. The easiest way to do this is to use the Windows Resource Kit, which includes the AutoExNT utility. This utility allows Windows batch files to run as a service. Create an Autoexnt.bat file (as per the instructions here: http://support.microsoft.com/kb/243486) that changes into your Protege directory and then runs the server. The following is an example batch file:

@echo off
C:
cd "\Program Files\Protege"
run_protege_server.bat

Once you have completed all the steps in the knowledge base article you will note that the service name is still “AutoExNT”. You can adjust this by using regedit to change the value of the following key to “Protege Server” (the change will not be visible until the server is restarted):

HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\AutoExNT\DisplayName


Working with Firewalls

If the Protege server is running inside a firewall, you need to open two ports to ensure access from outside the firewall. One port is for the rmiregistry and the other port is for the Protege server.

To set the port that the rmiregistry listens on (the default is 1099), the rmiregistry command must be invoked as:

rmiregistry portNumber

For a more specific example, suppose that we use port 5100 for the rmiregistry and we start rmiregistry with the command:

rmiregistry 5100


We will also need to tell the Protege server what port to use to access the rmiregistry. This is done in one of the following ways:


1. If you ran Protege server using the run_protege_server script, then you need to uncomment the line that starts with PORTOPTS:

In Linux/Mac OSX:

Uncomment the line (remove # at the beginning of line; already done below):

PORTOPTS="-Dprotege.rmi.server.port=5200 -Dprotege.rmi.registry.port=5100"

In Windows:

Uncomment the line (remove "rem" at the beginning of line; already done below):

set "PORTOPTS=-Dprotege.rmi.server.port=5200 -Dprotege.rmi.registry.port=5100"


2. If you ran Protege server using the Advanced method (running commands in the console), then you need to set two JVM arguments. The first tells the server which port the rmiregistry is using:

-Dprotege.rmi.registry.port=5100

The second argument sets the port that the Protege server itself listens on. Suppose we choose port 5200. We tell the Protege server to use this port with the following JVM option:

-Dprotege.rmi.server.port=5200

An example of these server settings can be found in the <protege-install-dir>/run_protege_server.sh Unix shell script.


Finally, when we log in from a Protege client, we need to tell the client where to find the rmiregistry. This is done by appending :5100 to the hostname in the rmiregistry connection screen:

smi-protege.stanford.edu:5100


The Protege client will determine how to find the Protege server when it contacts the rmiregistry.
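The same host:port form can be handled in code when checking that the rmiregistry is reachable through the firewall. The following is a minimal sketch, not part of Protege: it splits a string such as the one above (falling back to the standard rmiregistry port 1099 when no port is given) and lists the names bound in that registry using the standard java.rmi.registry API. The class and method names are hypothetical.

```java
import java.net.InetSocketAddress;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Protege API.
final class RegistryProbe {

    /** Splits "host" or "host:port" into an unresolved host/port pair. */
    static InetSocketAddress parseRegistryAddress(String text) {
        int colon = text.lastIndexOf(':');
        if (colon < 0) {
            // No port given: use the standard rmiregistry port (1099).
            return InetSocketAddress.createUnresolved(text, Registry.REGISTRY_PORT);
        }
        String host = text.substring(0, colon);
        int port = Integer.parseInt(text.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    /** Lists the names bound in the rmiregistry at the given address. */
    static List<String> boundNames(InetSocketAddress address) throws RemoteException {
        Registry registry =
                LocateRegistry.getRegistry(address.getHostString(), address.getPort());
        return Arrays.asList(registry.list());
    }

    public static void main(String[] args) throws RemoteException {
        String text = args.length > 0 ? args[0] : "localhost:5100";
        InetSocketAddress address = parseRegistryAddress(text);
        System.out.println("Names bound at " + text + ": " + boundNames(address));
    }
}
```

If the call to boundNames fails with a RemoteException when run from outside the firewall, the registry port is probably still blocked. A successful listing only confirms the registry port, not the separate Protege server port.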

Another important problem to solve is NAT (Network Address Translation): the IP address of the Protege server host is not the same inside the local network (for example 192.168.1.xxx) as it is outside the network (for example 9.154.38.47). To overcome this (but often losing the ability to access the Protege server from inside the network), you have to add -Djava.rmi.server.hostname=9.154.38.47 to the JVM start options of the Protege server.

Remember, you often have to restart the rmiregistry for changes to take effect.
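As a summary of this section, the sketch below assembles the three JVM options discussed above into one list, for example to pass to a launcher script. The property names are the ones given in this section; the class and method names are hypothetical, and the port numbers and address are just the example values used above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Protege API.
final class ServerJvmOptions {

    /**
     * Builds the JVM options for a server behind a firewall.
     * externalHostname may be null when no NAT workaround is needed.
     */
    static List<String> firewallOptions(int serverPort, int registryPort,
                                        String externalHostname) {
        List<String> options = new ArrayList<>();
        options.add("-Dprotege.rmi.server.port=" + serverPort);
        options.add("-Dprotege.rmi.registry.port=" + registryPort);
        if (externalHostname != null && !externalHostname.isEmpty()) {
            // Address that clients outside the NAT should use to reach the server.
            options.add("-Djava.rmi.server.hostname=" + externalHostname);
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", firewallOptions(5200, 5100, "9.154.38.47")));
    }
}
```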


Configuration Settings

There are several settings that can be configured in the client-server mode.


Server Side Settings

We have discussed some of the system properties:

-Dprotege.rmi.server.port=...
-Dprotege.rmi.registry.port=...

... above. There is another setting that can be very important for performance:

 -Dserver.use.compression=true

This option enables compression on the client-server connection. It will increase the CPU usage on the client and the server but can provide quite a significant improvement in bandwidth usage (early experiments suggest a compression factor of up to 10 to 1, and performance on the client is doubled for some large ontologies).

Finally, the option:

-Dtransaction.level=...

... controls the level of protection associated with transactions. The options are:

  • NONE - which means that transactions do not even have rollback capabilities. This is not recommended.
  • READ_UNCOMMITTED - which means that users can see other users' changes even if the other user is in a transaction and has not committed the transaction.
  • READ_COMMITTED - which means that user operations made during a transaction are not seen by other users until they are committed.
  • REPEATABLE_READ - which means that the system assures a user in a transaction that data he reads will not change for the duration of the transaction.
  • SERIALIZABLE - which means that transactions are serializable. This is the most stringent form of transaction processing but it is also the most expensive and the most likely to cause problems with locked databases.
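The level names above match the standard JDBC isolation levels defined as constants on java.sql.Connection. The following sketch shows that correspondence with a hypothetical enum. Whether Protege maps the levels to JDBC in exactly this way internally is an assumption, so treat this as an aid to understanding the options rather than as a description of the server code.

```java
import java.sql.Connection;

// Hypothetical enum for illustration; Protege's own implementation may differ.
enum TransactionLevel {
    NONE(Connection.TRANSACTION_NONE),
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE);

    private final int jdbcLevel;

    TransactionLevel(int jdbcLevel) {
        this.jdbcLevel = jdbcLevel;
    }

    /** The java.sql.Connection constant with the same meaning. */
    int jdbcLevel() {
        return jdbcLevel;
    }

    /** True if this level is at least as strict as the other one. */
    boolean atLeast(TransactionLevel other) {
        // The constants are declared from least to most strict.
        return ordinal() >= other.ordinal();
    }

    /** Parses a value as it would be written after -Dtransaction.level=. */
    static TransactionLevel parse(String value) {
        return valueOf(value.trim().toUpperCase());
    }
}
```

For example, TransactionLevel.parse("READ_COMMITTED").atLeast(TransactionLevel.READ_UNCOMMITTED) is true, reflecting that each level in the list is stricter than the ones before it.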


Client Side Settings

There are also optimizations that allow the user to control caching on the client. On the client side, the user can specify that the client will cache information about certain frames from the server-side ontology while the client initializes. For example, if the client adds the system properties:

  -Dserver.client.preload0=http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Oncogene_TIM
  -Dserver.client.preload1=http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Diagnostic_Therapeutic_and_Research_Equipment

... then the client will preload Oncogene_TIM and Diagnostic_Therapeutic_and_Research_Equipment and their parents before it starts up. For large ontologies and high network latency this can allow a user to start editing a selected set of classes without having to wait for the client to get the data from the server. The initialization of these classes is included in the Protege client startup.

The option:

-Dserver.client.preload.skip=true

... disables the pre-caching of frames on the client during the Protege client initialization process. The option:

-Dpreload.frame.limit=...

... limits the number of frames that the client will preload during initialization.
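When many frames should be preloaded, the numbered properties can be generated rather than typed by hand. The sketch below is a hypothetical helper that assumes, based only on the preload0 and preload1 example above, that the property suffixes are consecutive integers starting at 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Protege API.
final class PreloadOptions {

    /**
     * Turns a list of frame names into -Dserver.client.preloadN options.
     * Assumes suffixes are consecutive and start at 0, as in the example above.
     */
    static List<String> preloadOptions(List<String> frameNames) {
        List<String> options = new ArrayList<>();
        for (int i = 0; i < frameNames.size(); i++) {
            options.add("-Dserver.client.preload" + i + "=" + frameNames.get(i));
        }
        return options;
    }

    public static void main(String[] args) {
        String base = "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#";
        List<String> frames = new ArrayList<>();
        frames.add(base + "Oncogene_TIM");
        frames.add(base + "Diagnostic_Therapeutic_and_Research_Equipment");
        for (String option : preloadOptions(frames)) {
            System.out.println(option);
        }
    }
}
```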


Debug and Performance Monitoring

Client or Server Side Settings

We have several tools to monitor and debug the performance of the Protege client-server. The logging.properties settings:

java.util.logging.ConsoleHandler.level=FINE

...

edu.stanford.smi.protege.server.socket.MonitoringInputStream.level=FINE
edu.stanford.smi.protege.server.socket.CompressingInputStream.level=FINE

will start a reasonable amount of logging that will give you some idea of what traffic is going across the wire. These settings can be used on either the server or the client and they will measure and report on the low-level traffic being received by RMI. The vast majority of the traffic goes from the server to the client, so we anticipate that this setting will generally be used on the client. Here is a sample of the information generated:

Average Compression Ratio = 10.412 to 1, Compressed = 0.27 MB, Uncompressed = 2.76 MB (Cumulative)
InputStream 0: 3 megabytes read
Average Compression Ratio = 11.277 to 1, Compressed = 0.33 MB, Uncompressed = 3.77 MB (Cumulative)

The input stream line shows the data read from the socket. The compression ratio lines show the difference between the compressed data that actually goes over the wire and the uncompressed data that is read by the caller.
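For code that embeds a Protege client and cannot easily edit logging.properties, a similar effect can be approximated with the standard java.util.logging API. This is a sketch under that assumption: the logger names are the ones from the settings above, while the class name is hypothetical. Instead of lowering the level of the global ConsoleHandler, it attaches a dedicated handler to the two loggers.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not part of the Protege API.
final class TrafficLogging {

    // Strong references, so the configured loggers are not garbage collected.
    private static final Logger MONITORING = Logger.getLogger(
            "edu.stanford.smi.protege.server.socket.MonitoringInputStream");
    private static final Logger COMPRESSING = Logger.getLogger(
            "edu.stanford.smi.protege.server.socket.CompressingInputStream");

    /** Enables FINE logging for the two traffic-monitoring loggers. */
    static void enable() {
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        for (Logger logger : new Logger[] {MONITORING, COMPRESSING}) {
            logger.setLevel(Level.FINE);
            logger.addHandler(handler);
            // Avoid printing each record a second time through the root handler.
            logger.setUseParentHandlers(false);
        }
    }

    public static void main(String[] args) {
        enable();
        MONITORING.fine("FINE messages from this logger now reach the console");
    }
}
```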


Client Side Settings

In addition we have tools for simulating real world delays in a localhost setting. The following setting only applies on the client:

-Dserver.delay=80 

Note that in previous versions of Protege the server.delay option was set on the server side. This setting simulates an 80 millisecond delay for the latency associated with each RMI call. This is a very large delay for latency (a single call is noticeable by a Protege user) but experience has shown that this value does appear in real world installations.

In addition, we have the following JVM properties that can be applied on either the server or the client:

-Dserver.upload.kilobytes.second=128 -Dserver.download.kilobytes.second=500

However, we anticipate that these settings will be more useful if applied on the client because then a profiler, such as YourKit, can determine how much time is being used for latency and how much time is being used for downloads and uploads. If set on the client, the upload setting tells the client to simulate delays that mimic the transfer rate for data from the client to the server, and the download setting tells the client to simulate delays that mimic the transfer rate for data from the server to the client. Note that some tools measure upload and download bandwidth in kilobits or megabits per second. So a DSL connection with 6 megabits/second download and 512 kilobits/second upload is getting 768 kilobytes/second download and 64 kilobytes/second upload.
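The unit conversion is easy to get wrong, so here is a small sketch of the arithmetic for a connection with 6 megabits/second download and 512 kilobits/second upload: divide bits by 8 to get bytes, and treat one megabit as 1024 kilobits. The class and method names are hypothetical. The time estimate at the end is only a rough model (latency per call plus raw transfer time) and is an assumption, not a description of how Protege applies the simulated delays.

```java
// Hypothetical helper for illustration; not part of the Protege API.
final class BandwidthSettings {

    /** Converts kilobits/second to kilobytes/second (8 bits per byte). */
    static long kilobitsToKilobytes(long kilobitsPerSecond) {
        return kilobitsPerSecond / 8;
    }

    /** Converts megabits/second to kilobytes/second (1 megabit = 1024 kilobits). */
    static long megabitsToKilobytes(long megabitsPerSecond) {
        return megabitsPerSecond * 1024 / 8;
    }

    /** Formats the two JVM properties described above. */
    static String jvmOptions(long uploadKilobytes, long downloadKilobytes) {
        return "-Dserver.upload.kilobytes.second=" + uploadKilobytes
                + " -Dserver.download.kilobytes.second=" + downloadKilobytes;
    }

    /** Rough estimate: per-call latency plus download time, in seconds. */
    static double estimatedSeconds(int rmiCalls, int delayMillis,
                                   long downloadedKilobytes, long downloadKilobytesPerSecond) {
        double latency = rmiCalls * delayMillis / 1000.0;
        double transfer = downloadedKilobytes / (double) downloadKilobytesPerSecond;
        return latency + transfer;
    }

    public static void main(String[] args) {
        long down = megabitsToKilobytes(6);   // 768
        long up = kilobitsToKilobytes(512);   // 64
        System.out.println(jvmOptions(up, down));
        System.out.println(estimatedSeconds(100, 80, 3840, down) + " seconds");
    }
}
```

Under this model, 100 RMI calls at 80 milliseconds each cost 8 seconds of latency, which already exceeds the 5 seconds needed to download 3840 kilobytes at 768 kilobytes/second. This is why reducing the number of calls often matters more than reducing the amount of data.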


Accessing the Server Programmatically

The following is an example of how to access the remote server programmatically:

RemoteProjectManager rpm = RemoteProjectManager.getInstance();
Project p = rpm.getProject("localhost", "Timothy Redmond", "troglodyte", "Newspaper", true);
KnowledgeBase kb = p.getKnowledgeBase();

The code below shows another, more complicated method that eases the process of opening multiple projects. This approach will allow you to use the same session object to retrieve different remote projects, but you cannot use the same session to open a single project twice.

Project p = null;
try {
    RemoteServer server = (RemoteServer) Naming.lookup("//localhost/" + Server.getBoundName());
    if (server != null) {
        RemoteSession session = server.openSession("Timothy Redmond",
                                                   SystemUtilities.getMachineIpAddress(), 
                                                   "troglodyte");
        if (session != null) {
            RemoteServerProject serverProject = server.openProject("Newspaper", session);
            if (serverProject != null) {
                p = RemoteClientProject.createProject(server, serverProject, session, true);
            }
        }
    }
} catch (Exception e) {
    Log.getLogger().severe(Log.toString(e));
}
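The lookup above uses "//localhost/", which implies the default rmiregistry port. When the registry runs on another port, as in the firewall example with port 5100, the standard RMI naming URL takes the form //host:port/name. The sketch below builds such a URL and performs the lookup with java.rmi.Naming. The class and method names are hypothetical, and the bound name is passed in as a parameter (in the code above it comes from Server.getBoundName()).

```java
import java.net.MalformedURLException;
import java.rmi.Naming;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical helper for illustration; not part of the Protege API.
final class RmiUrls {

    /** Builds an RMI naming URL of the form //host:port/name. */
    static String serverUrl(String host, int registryPort, String boundName) {
        return "//" + host + ":" + registryPort + "/" + boundName;
    }

    /** Looks up the remote object registered under boundName at host:registryPort. */
    static Remote lookup(String host, int registryPort, String boundName)
            throws MalformedURLException, NotBoundException, RemoteException {
        return Naming.lookup(serverUrl(host, registryPort, boundName));
    }
}
```

The result of lookup would then be cast to RemoteServer, exactly as in the example above.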

Some other methods in the RemoteServer interface that may be of interest are:

  • getAvailableProjectNames(RemoteSession)
  • allowsCreateUsers()
  • createUser(String, String)
  • createProject(...)
  • shutdown()

To disconnect from the server, you only need to call the dispose method on the client side project:

    p.dispose();


Running Custom Code on the Server

In Progress. This page is more of a skeleton than the actual needed documentation.

On some occasions, in a client-server situation, it is useful or necessary to run some code on the server rather than on the client. One common reason for this is that an otherwise reasonable block of code may perform very slowly in client-server mode because it makes too many calls over a possibly high-latency and bandwidth-constrained network. Instead of having the client make several calls to the server, it may make sense to ship the code over to the server and have the code run directly in the server context. An example of a ProtegeJob can be found in the project that can be checked out of svn from here. The actual ProtegeJob itself can be found in MyJob.java.

  1. Writing a ProtegeJob is actually pretty easy despite the slightly esoteric issues below. You just need to make a class that has all the data needed for the job and write a run method.
  2. You need to make sure that the ProtegeJob class is found on the server. In your case this might mean that you need to write a plugin for the server that contains the ProtegeJob classes that you are using. Alternatively, you could put the ProtegeJob in a jar on the server classpath.
  3. OWLModels cannot be transferred over the wire. This means that objects that contain an OWLModel, e.g. an OWLClass, need to have their internal OWLModels restored when they arrive over the wire from the client or from the server. This is done with a localize(KnowledgeBase) method. If you override the ProtegeJob.localize method, remember to call super.localize.
  4. To get the OWLModel that you need during the execution of the ProtegeJob, use the expression ((OWLModel) getKnowledgeBase()).
  5. Be aware of what data is being sent over the wire. An inline (anonymous) ProtegeJob will pick up data from its enclosing class. It is probably better to put a ProtegeJob in its own class or make it a static class.
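The last point can be demonstrated with plain Java serialization, without any Protege classes. In the sketch below, which uses a hypothetical SerializableJob interface as a stand-in for ProtegeJob, an anonymous job that reads one small field of its enclosing object drags the whole enclosing object, including a one megabyte cache, into the serialized form. The static nested job carries only the two strings it needs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;

// Plain-Java illustration; SerializableJob is a hypothetical stand-in for ProtegeJob.
final class JobCaptureDemo {

    interface SerializableJob extends Callable<String>, Serializable {}

    /** Stand-in for a client-side object that holds a lot of state. */
    static final class ClientTool implements Serializable {
        private final byte[] largeCache = new byte[1_000_000];
        private final String prefix;

        ClientTool(String prefix) {
            this.prefix = prefix;
        }

        /** Anonymous job: reading prefix captures the whole ClientTool instance. */
        SerializableJob inlineJob(final String name) {
            return new SerializableJob() {
                @Override
                public String call() {
                    return prefix + name;
                }
            };
        }
    }

    /** Static nested job: serializes only its own two fields. */
    static final class StandaloneJob implements SerializableJob {
        private final String prefix;
        private final String name;

        StandaloneJob(String prefix, String name) {
            this.prefix = prefix;
            this.name = name;
        }

        @Override
        public String call() {
            return prefix + name;
        }
    }

    /** Returns the number of bytes produced by Java serialization of the object. */
    static int serializedSize(Object object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        ClientTool tool = new ClientTool("Class: ");
        System.out.println("inline job bytes:     " + serializedSize(tool.inlineJob("Newspaper")));
        System.out.println("standalone job bytes: "
                + serializedSize(new StandaloneJob("Class: ", "Newspaper")));
    }
}
```

Both jobs return the same result, but the anonymous one serializes to more than a megabyte. If the enclosing class were not serializable at all, serializing the anonymous job would fail with a NotSerializableException instead.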

Read more about the Protege multi-user support in the Protege client-server tutorial.
