From Protege Wiki

Revision as of 13:56, December 18, 2008

Protege Client-Server Tutorial

This tutorial explains how to configure and run Protege in client-server mode and is intended for experienced Protege users.

Please note the following limitations of the client-server capabilities:

  • The security system described in the "configuring the server" section is not implemented. The classes exist in the "projects project" but the underlying implementation is absent in the application. The only available security is via the user accounts and passwords.
  • Forms edited by clients are not propagated to the server (or to other clients).


Starting and Testing the Server

Start the Server

There are two ways to start the Protege server. The easier, recommended way uses the run_protege_server scripts available in the Protege installation directory. The more advanced way takes two steps and involves typing commands in a console.

Please start by trying out the first solution (the recommended one), and only if that fails, try the second one.

Starting the Protege server using the run_protege_server script (Recommended)

Try this method first.

In the Protege installation directory there is a script called run_protege_server.bat on Windows and run_protege_server.sh on Linux and MacOS. This script will start a background process called "rmiregistry", which is part of the standard Java Runtime Environment. The script will then start the Protege server by invoking a Java program (if you want to see the details about the command, look in the advanced method for starting the server).

If you have installed Protege with a Java VM included, then all you have to do to start the server is open a console window (on Windows, Start menu -> Run -> cmd) and run the run_protege_server script.

Note: If you don't know whether you have installed Protege with a Java VM included, look in the Protege installation folder. If you see a jre subfolder, then you have installed Protege with a Java VM included. If you do not see the jre folder, then you have not.

If you have Protege installed with an included Java VM, then you're done with this step. You only need to run the run_protege_server script in a console.

If you have not installed Protege with an included Java VM, then you need to adjust the path to the Java VM in the run_protege_server script. To do that, edit the run_protege_server script in a text editor.


Windows

In run_protege_server.bat, edit the line:

set JDKBIN=jre\bin

to match your path to the JRE bin folder. For example, a common place would be in the Program files folder. If you don't know whether you have Java installed, or where, ask your system administrator. Or, even easier, just download the Protege installer with a Java VM included, and you don't have to go through this configuration. A typical example of where Java is installed on Windows is this:

set JDKBIN="C:\Program Files\Java\jre1.5.0_08\bin"


Linux and MacOS

The run_protege_server.sh script uses the JAVA_HOME environment variable to figure out where to find Java. If JAVA_HOME in your setup already points to the Java VM installation directory, then you don't need to do anything. If it doesn't, then please edit run_protege_server.sh in a text editor, and add at the beginning of the file (e.g. after the commented line containing "Where is Java?") a line setting JAVA_HOME to the path to Java on your system. An example from an Ubuntu installation is:

JAVA_HOME=/usr/lib/jvm/java-1.5.0-sun-1.5.0.16/bin
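
Before editing the script, it can help to confirm that the directory you plan to use actually contains a Java launcher. The following is a minimal sketch, not part of the Protege distribution: the helper name find_java is hypothetical, and it simply looks for an executable named java either in the given directory or in its bin subfolder.

```shell
# Sketch: print the path of the java executable under a candidate directory.
# Checks "<dir>/bin/java" (dir is the Java install root) and then
# "<dir>/java" (dir is already the bin folder). Returns 1 if neither exists.
# The helper name find_java is hypothetical.
find_java() {
    dir="$1"
    if [ -x "$dir/bin/java" ]; then
        printf '%s\n' "$dir/bin/java"
    elif [ -x "$dir/java" ]; then
        printf '%s\n' "$dir/java"
    else
        return 1
    fi
}

# Example: check the current JAVA_HOME, if it is set.
if [ -n "${JAVA_HOME:-}" ]; then
    find_java "$JAVA_HOME" || echo "No java executable found under $JAVA_HOME" >&2
fi
```

If the function prints nothing for your directory, the path is probably wrong and the server script is likely to fail with a "path not found" style error.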


OK, so now you are ready to launch the Protege server. In a console, change the directory to the Protege installation directory, and type run_protege_server.bat on Windows, or run_protege_server.sh on Linux and MacOS. You should see something like this:


StartProtegeServer.png


Congratulations! You have started the server. If you have trouble starting the server, please see the troubleshooting section below. The next step is to configure the projects on the server (see the configuration section below). If you are running the server behind a firewall, there is some additional configuration that you must do to ensure that clients can connect to the server through the firewall (see the firewall section below).


Starting the Protege server in two steps using console commands (Advanced)

Try this method only if the recommended way of starting the server (see above) has failed. Skip this section if you have already succeeded in starting the server.

There are two processes that need to be started. In the root directory of your Protege installation, we have provided both an example Windows batch file ("run_protege_server.bat"), and an example Unix shell script ("run_protege_server.sh") to launch these processes. It is important to be aware of what the batch files are doing, so we suggest starting the processes manually at least once.

The first process to kick off is a background process called "rmiregistry", which is part of the standard Java Runtime Environment. If you downloaded the version of Protege that includes a Java Virtual Machine, you will find the rmiregistry executable in your jre/bin directory. If you installed Protege without a VM, you will need to locate and run the rmiregistry executable from the directory where you installed your VM (no parameters are necessary).

To follow are the steps for running the rmiregistry executable:

  • Launch a console window.
  • Change the working directory to the Protege installation directory.
  • If you are using Windows, type "start /min jre\bin\rmiregistry".
  • If you are using Unix, type "jre/bin/rmiregistry &".

Note: If you installed Protege without a VM and your Java installation is located in a path with spaces (such as "C:\Program Files\Java\jre1.5.0_08\bin\rmiregistry.exe") the start command for the RMI Registry in the "run_protege_server.bat" will fail. To fix this, change the command to:

start "rmiregistry" /min "C:\Program Files\Java\jre1.5.0_08\bin\rmiregistry.exe"

The rmiregistry process will either start with no messages, or will crash saying that the port is already in use. The latter result probably means that you are already running rmiregistry. Either of these results is fine. You only need one version of rmiregistry running on your machine since all applications can use it.
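
If you want to know in advance whether something is already listening on the registry port, a quick check along the following lines may help. This is a sketch that assumes the bash shell (it relies on bash's /dev/tcp feature) and the default registry port 1099; the helper name port_in_use is hypothetical. It only tells you that some process accepts connections on the port, not that the process is rmiregistry.

```shell
# Sketch: succeed if something accepts TCP connections on host:port.
# Relies on bash's /dev/tcp pseudo-device, so run it with bash.
# The helper name port_in_use is hypothetical.
port_in_use() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Example: check the default rmiregistry port on this machine.
if port_in_use localhost 1099; then
    echo "Port 1099 is already in use (rmiregistry may be running)."
else
    echo "Nothing is listening on port 1099."
fi
```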

The second process to kick off is the Protege server process, using the following command (please note that this is a single command):

Windows:

jre\bin\java -Xmx200M -cp protege.jar;looks-2.1.3.jar;unicode_panel.jar -Djava.rmi.server.codebase=file:/c:/program%20files/protege_3.3.1/protege.jar edu.stanford.smi.protege.server.Server examples\server\metaproject.pprj

Unix:

jre/bin/java -Xmx200M -cp protege.jar:looks-2.1.3.jar:unicode_panel.jar -Djava.rmi.server.codebase=file:/home/rwf/protege_3.1/protege.jar edu.stanford.smi.protege.server.Server examples/server/metaproject.pprj

The meaning of the parameters will be explained in the Configuring the Server section. The intent of this section is simply to get you going as quickly as possible. The set of parameters listed above assumes that you have installed Protege in the default location with the default directory name. On Unix, you will have to change the path of the codebase parameter to point to the proper location of the Protege JAR file (note that this path must be "absolute" from the root - no shortcuts!).
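
As an illustration of the codebase value, the sketch below builds the file: URL from an installation directory in the same shape as the two commands above: an absolute path, with spaces written as %20, followed by /protege.jar. The helper name codebase_url is hypothetical, and the function only encodes spaces; other special characters in a path are not handled.

```shell
# Sketch: build the -Djava.rmi.server.codebase value for an install directory.
# Rejects relative paths, because the codebase path must be absolute.
# The helper name codebase_url is hypothetical.
codebase_url() {
    case "$1" in
        /*) ;;
        *) echo "codebase path must be absolute: $1" >&2; return 1 ;;
    esac
    # Encode spaces as %20, as in the Windows example (program%20files).
    encoded=$(printf '%s' "$1" | sed 's/ /%20/g')
    printf 'file:%s/protege.jar\n' "$encoded"
}

# Example: the Unix installation directory used in the command above.
codebase_url "/home/rwf/protege_3.1"
```

The result can then be passed to the server command as -Djava.rmi.server.codebase=<value>.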

The Protege server should start up with the normal console window output (the build number, the version of the JVM used, a list of installed plug-ins, etc.). In addition, you should see the following output:

Available Projects:
	Newspaper
	Wines
Protege server ready to accept connections...

To follow is a screenshot of what this looks like on a Windows machine:

ClientServerTutorial start-the-server.jpg

Congratulations! You now have the server running.

Please note the following:

  • On Unix, the metaproject unfortunately refers to other Protege projects using relative paths containing Windows "\" characters rather than Unix "/" characters. Please edit the metaproject (with Protege) to correct these characters before starting the server, otherwise it will complain about projects that can't be found.
  • If you start the server twice there are no error messages. The second instance takes over accepting new client connections, while the first continues working with existing client connections, if any.
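
To see whether a file still contains Windows-style paths, you can search it for backslash characters. The sketch below only reports matching lines and does not modify anything; the edit itself should still be done in Protege as described above. The helper name find_backslashes is hypothetical, and which file to pass is an assumption: it presumes the metaproject's data is stored in plain-text files alongside metaproject.pprj.

```shell
# Sketch: print the lines (with line numbers) of a file that contain a
# backslash. Returns non-zero when no backslash is found.
# The helper name find_backslashes is hypothetical.
find_backslashes() {
    grep -n '\\' "$1"
}

# Usage (the file name is up to you):
#   find_backslashes path/to/metaproject-file
```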


Troubleshooting Protege server starting problems

"The system can not find the path specified" - If you have messages saying that the system cannot find the path specified, or other similar messages, it means that the Java VM was not found on the path as specified in the run_protege_server script, or as you have specified in the command in the console (depending on what method you used to start the Protege server). If you have installed Protege with an included Java VM (the easiest way), then you should not get this message. If you have not installed Protege with an included Java VM, then check in run_protege_server.bat or .sh that you have specified the right path to the local Java installation. See the recommended way for starting the Protege server for examples.


What backend to use on the Protege server

Warning: We recommend the use of database projects with the Protege server. However, if for some reason you want to use a file-based project (like those in the example above), you will need to tell the server to periodically save the project to disk. If you do not specify a save interval, any changes you make will be lost if the server goes down.

You can tell the Protege server to periodically save your project.

If you have used the run_protege_server method to start your server, then edit the run_protege_server script and make sure the line that contains the option for auto-saving projects is enabled.

Windows

You don't need to do anything, because auto-save is enabled by default with a two-minute interval:

set SAVE_INTERVAL=-saveIntervalSec=120

If you don't want this option, you can add a rem in front of this line.

Linux

Remove the comment symbol # from the line:

SAVE_INTERVAL=-saveIntervalSec=120


If you have started the server using the Advanced method, then you need to add an extra argument to the server start command:

...server.Server -saveIntervalSec=N ...

where N is the number of seconds between saves. A reasonable number for N is perhaps 120 (every two minutes). If your project is large you may want to choose a larger number, since the system will be unavailable while the save is taking place. Note that saves are only performed when a project changes.


Connect a Client to the Server

On the same machine, start Protege by double-clicking on the executable. From the "Welcome to Protege" dialog, click the "Open Existing Project..." button, which brings up the "Open Project" dialog. If you have configured Protege not to show the welcome dialog, you can bring up the Open Project dialog by choosing the File | Open Project... menu item. In the Open Project dialog, click on the Server button in the lower-left corner. To follow is a screenshot of the dialog after clicking on the Server button:

ClientServerTutorial open-server-project.jpg

Leave the default information in the dialog (User Name = "Guest", etc.) and press OK. You will then see the "Select Project" dialog, which shows you the available projects (Newspaper and Wines) to choose from:

ClientServerTutorial select-server-project.jpg

Select the "Newspaper" example and press OK.

In a moment the Newspaper project will load and be available in the client. Changes that you make in the client are actually being propagated to the server (although this is difficult to see with just one client running!).

Connect a Second Client to the Server

You can run a second client either on the same machine or on a different machine. If you choose to connect from the same machine, just follow the instructions given above. If you choose to connect from a different machine, you need to enter the DNS name for the server machine in the "Host Machine Name" text box of the Open Project dialog (in place of the default value of "localhost"). The DNS name will be something like "yourservermachine.yourdomain.com".

Note that you must use the TCP/IP name (DNS name) for the machine and not the "Windows Networking" name for the machine, even if both of your client and server machines are running Windows. If you don't know your server machine's DNS name you may need to get someone at your site to help you figure it out. If your server machine does not have a DNS name then you cannot connect to it from another machine using Protege.

When you make changes in either of the clients, the changes will immediately be reflected in the other. Try it!

Configuring the Server

This section describes how to configure the projects that are available from the Protege server. (Read this section only if you have succeeded in performing the steps from the previous section, Starting and Testing the Server.)

The Metaproject

The metaproject, located in the examples\server subdirectory by default, contains information about which Protege projects are exported and which users have access to these projects. Note that the built-in security concerning which users can access which projects is on top of whatever other security your system provides, e.g. a firewall.

Use Protege to open the metaproject and spend some time browsing the class hierarchy. You will find a very simple ontology of users, security, and projects:

ClientServerTutorial metaproject.jpg

Instances of the Project class will be made available to people identified with instances of the User class. The security model represented by the metaproject ontology is essentially equivalent to the security model of the Unix file system. Permissions are divided into "read" and "write" access for users categorized into "owner", "group", and "world". Every project has exactly one owner and users may be a member of any number of groups. "World" is a group that has everyone as a member. The Unix security model is extended a bit in the sense that individual users can be given specific access to a project.

If you examine instances of the User class, you will find the default "Guest" user. For the security conscious, your first task should be to delete the "Guest" user, and any other default users. (Before doing this, you may want to try creating some new users and ensure that they can successfully connect).

If you examine instances of the Project class, you will encounter the Newspaper and Wines projects. Note that these are just references to projects that exist on your disk as part of the default Protege installation. Also note that the specified file locations are relative to the Protege installation directory (actually the current working directory) rather than to the location of the metaproject. If you find this confusing, you can always specify the file locations as absolute paths. On a Windows machine, for example, you can specify the absolute path as C:\\MyProjectDiretory\\kbs\\MyProtegeKB.pprj or C:/MyProject/kbs/MyProtegeKB.pprj. If you have other projects that you have created and you want to export them in the client-server version, you should create instances of these projects in the metaproject. (We recommend making a copy of the metaproject first, just in case!) Remember to configure the security for your projects since, by default, only the owner has access to a particular project.

After editing the metaproject, save it and restart the Protege server. There is currently no way for the server to read an updated metaproject (we may provide this in the future). You should now be able to see the results of your changes, such as additional projects and users, when you connect from a new client.

Create new users from the client

By default, the server will disallow the creation of new users in the "Login to Server" panel when clicking on the "New user" button. To allow the creation of new users by clicking on the "New user" button on the client, add to the protege.properties file the following line:

server.allow.create.users=true

After editing the protege.properties, you need to restart the Protege server, for the changes to take effect.
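
If you prefer to make this change from a console, the following sketch appends the line only when the property is not already present. The helper name enable_create_users is hypothetical, and the location of protege.properties is left to you, since it depends on your installation. If the property is already set (to any value), the function leaves the file untouched so that you can review it by hand.

```shell
# Sketch: add server.allow.create.users=true to a properties file unless the
# property is already defined there.
# The helper name enable_create_users is hypothetical.
enable_create_users() {
    props="$1"
    if grep -q '^server\.allow\.create\.users=' "$props" 2>/dev/null; then
        echo "server.allow.create.users is already set in $props"
    else
        printf '%s\n' 'server.allow.create.users=true' >> "$props"
    fi
}

# Usage:
#   enable_create_users /path/to/protege.properties
```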

Advanced Topics

Running as a Windows Service

In some cases it may be desirable to have the Protege server start as a Windows service. The easiest way to do this is to use the Windows Resource Kit, which includes the AutoExNT utility. This utility allows Windows batch files to run as a service. Create an Autoexnt.bat file (as per the instructions here: http://support.microsoft.com/kb/243486) that changes into your Protege directory and then runs the server. To follow is an example batch file:

@echo off
C:
cd "\Program Files\Protege"
run_protege_server.bat

Once you have completed all the steps in the knowledge base article you will note that the service name is still “AutoExNT”. You can adjust this by using regedit to change the value of the following key to “Protege Server” (the change will not be visible until the server is restarted):

HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\AutoExNT\DisplayName


Working with Firewalls

If the Protege server is running inside a firewall, you need to open two ports to ensure access from outside the firewall. One port is for the rmiregistry and the other port is for the Protege server.

To set the port that the rmiregistry listens on (the default is 1099), the rmiregistry command must be invoked as:

rmiregistry portno

For a more specific example, suppose that we use port 5100 for the rmiregistry and we start rmiregistry with the command:

rmiregistry 5100

We will also need to tell the Protege server what port to use to access the rmiregistry. This is done by adding the following JVM option to the command that starts the server:

-Dprotege.rmi.registry.port=5100

Next, suppose we choose port 5200 for the Protege server to listen on. We tell the Protege server to use this port with the following JVM option:

-Dprotege.rmi.server.port=5200

An example of these server settings can be found in the <protege-install-dir>/run_protege_server.sh Unix shell script.

Finally, when we log in from a Protege client, we need to tell the client where to find the rmiregistry. This is done by appending :5100 to the hostname in the rmiregistry connection screen:

elliptic.dyndns.org:5100

The Protege client will determine how to find the Protege server when it contacts the rmiregistry.

Another important problem to solve is NAT (Network Address Translation): the IP address of the Protege server host is not the same inside the local network (for example 192.168.1.xxx) as it is outside the network (for example 9.154.38.47). To overcome this (but often losing the ability to access the Protege server from inside the network), you have to add -Djava.rmi.server.hostname=9.154.38.47 to the JVM start options of the Protege server.

Remember, you often have to restart the rmiregistry for changes to take effect.
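
Putting the pieces of this section together, the sketch below assembles the JVM options for a firewalled server from a registry port, a server port, and, optionally, the externally visible address for the NAT case. The helper name firewall_jvm_opts is hypothetical; the option names and example values are the ones used above. The function only prints the options, which you would then add to the command that starts the server, after starting rmiregistry on the same registry port.

```shell
# Sketch: print the JVM options for a server behind a firewall.
#   $1 = rmiregistry port, $2 = Protege server port,
#   $3 = external host name or IP address (optional, for NAT).
# The helper name firewall_jvm_opts is hypothetical.
firewall_jvm_opts() {
    opts="-Dprotege.rmi.registry.port=$1 -Dprotege.rmi.server.port=$2"
    if [ -n "${3:-}" ]; then
        opts="$opts -Djava.rmi.server.hostname=$3"
    fi
    printf '%s\n' "$opts"
}

# Example: the ports and external address used in this section.
firewall_jvm_opts 5100 5200 9.154.38.47
```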

Accessing the Server Programmatically

To follow is an example of how to access the remote server programmatically:

RemoteProjectManager rpm = RemoteProjectManager.getInstance();
Project p = rpm.getProject("localhost", "Timothy Redmond", "troglodyte", "Newspaper", true);
KnowledgeBase kb = p.getKnowledgeBase();

The following code is a more involved alternative that eases the process of opening multiple projects. This approach allows you to use the same session object to retrieve different remote projects, but you cannot use the same session to open a single project twice.

Project p = null;
try {
    RemoteServer server = (RemoteServer) Naming.lookup("//localhost/" + Server.getBoundName());
    if (server != null) {
        RemoteSession session = server.openSession("Timothy Redmond",
                                                   SystemUtilities.getMachineIpAddress(), 
                                                   "troglodyte");
        if (session != null) {
            RemoteServerProject serverProject = server.openProject("Newspaper", session);
            if (serverProject != null) {
                p = RemoteClientProject.createProject(server, serverProject, session, true);
            }
        }
    }
} catch (Exception e) {
    Log.getLogger().severe(Log.toString(e));
}

Some other methods in the RemoteServer interface that may be of interest are:

  • getAvailableProjectNames(RemoteSession)
  • allowsCreateUsers()
  • createUser(String, String)
  • createProject(...)
  • shutdown()

To disconnect from the server, you only need to call the dispose method on the client side project:

    p.dispose();

Configuration Settings

There are several settings that can be configured in the client-server mode.

Server Side Settings

We have discussed some of the system properties:

-Dprotege.rmi.server.port=...
-Dprotege.rmi.registry.port=...

... above. There is another setting that can be very important for performance:

 -Dserver.use.compression=true

This option enables compression on the client-server connection. It will increase the CPU usage on the client and the server but can provide quite a significant improvement in bandwidth usage (early experiments suggest a compression factor of up to 10 to 1, and performance on the client is doubled for some large ontologies).

Finally, the option:

-Dtransaction.level=...

... controls the level of protection associated with transactions. The options are:

  • NONE - which means that transactions do not even have rollback capabilities. This is not recommended.
  • READ_UNCOMMITTED - which means that users can see other users' changes even if the other user is in a transaction and has not committed the transaction.
  • READ_COMMITTED - which means that user operations made during a transaction are not seen by other users until they are committed.
  • REPEATABLE_READ - which means that the system assures a user in a transaction that data he reads will not change for the duration of the transaction.
  • SERIALIZABLE - which means that transactions are serializable. This is the most stringent form of transaction processing but it is also the most expensive and the most likely to cause problems with locked databases.

Client Side Settings

There are also optimizations that allow the user to control caching on the client. On the client side, the user can specify that the client will cache information about certain frames from the server-side ontology while it initializes. For example, if the client adds the system properties:

-Dserver.client.preload0=Oncogene_TIM -Dserver.client.preload1=Gene_Kind

... then the client will preload Oncogene_TIM and Gene_Kind and their parents before it starts up. For large ontologies and bad network latency this can allow a user to start editing a selected set of classes without having to wait for the client to get the data from the server. The initialization of these classes is included in the Protege client startup.
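
When the list of frames to preload is long, numbering the properties by hand becomes error-prone. The sketch below generates the options from a list of frame names, following the preload0, preload1 numbering shown above. The helper name preload_opts is hypothetical, and the sketch assumes frame names without spaces.

```shell
# Sketch: print one -Dserver.client.preloadN=<frame> option per argument,
# numbering the options from 0 in the order given.
# The helper name preload_opts is hypothetical.
preload_opts() {
    i=0
    opts=""
    for frame in "$@"; do
        opts="$opts${opts:+ }-Dserver.client.preload$i=$frame"
        i=$((i + 1))
    done
    printf '%s\n' "$opts"
}

# Example: the two frames used in this section.
preload_opts Oncogene_TIM Gene_Kind
```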

The option:

-Dserver.client.preload.skip=true

... disables the pre-caching of frames on the client during the Protege client initialization process. The option:

-Dpreload.frame.limit=...

... limits the number of frames that the client will preload during initialization.

Debug and Performance Monitoring

Client or Server Side Settings

We have several tools to monitor and debug the performance of the Protege client-server. The logging.properties settings:

java.util.logging.ConsoleHandler.level=FINE

...

edu.stanford.smi.protege.server.socket.MonitoringInputStream.level=FINE
edu.stanford.smi.protege.server.socket.CompressingInputStream.level=FINE

will start a reasonable amount of logging that gives you some idea of what traffic is going across the wire. These settings can be used on either the server or the client, and they measure and report on the low-level traffic being received by RMI. The vast majority of the traffic goes from the server to the client, so we anticipate that this setting will generally be used on the client. Here is a sample of the information generated:

Average Compression Ratio = 10.412 to 1, Compressed = 0.27 MB, Uncompressed = 2.76 MB (Cumulative)
InputStream 0: 3 megabytes read
Average Compression Ratio = 11.277 to 1, Compressed = 0.33 MB, Uncompressed = 3.77 MB (Cumulative)

The input stream line shows the data read from the socket. The compression ratio lines show the difference between the compressed data that actually goes over the wire and the uncompressed data that is read by the caller.
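
If you capture this output in a file, the most recent cumulative ratio can be pulled out with standard text tools. The sketch below assumes the log lines have exactly the shape shown in the sample above; the helper name latest_compression_ratio is hypothetical.

```shell
# Sketch: read log text on standard input and print the ratio from the last
# "Average Compression Ratio" line.
# The helper name latest_compression_ratio is hypothetical.
latest_compression_ratio() {
    sed -n 's/^.*Average Compression Ratio = \([0-9.]*\) to 1.*$/\1/p' | tail -n 1
}

# Example: the sample output from this section.
printf '%s\n' \
    'Average Compression Ratio = 10.412 to 1, Compressed = 0.27 MB, Uncompressed = 2.76 MB (Cumulative)' \
    'InputStream 0: 3 megabytes read' \
    'Average Compression Ratio = 11.277 to 1, Compressed = 0.33 MB, Uncompressed = 3.77 MB (Cumulative)' \
    | latest_compression_ratio
```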

Client Side Settings

In addition we have tools for simulating real world delays in a localhost setting. The following setting only applies on the client:

-Dserver.delay=80 

This setting simulates an 80 millisecond delay for the latency associated with each RMI call. This is a very large latency (a single call is noticeable to a Protege user), but experience has shown that this value does occur in real-world installations. Note that in previous versions of Protege the server.delay option was set on the server side.

In addition, we have the following JVM properties that can be applied on either the server or the client:

-Dserver.upload.kilobytes.second=128 -Dserver.download.kilobytes.second=500

However, we anticipate that these settings will be more useful if applied on the client, because then a profiler such as YourKit can determine how much time is spent on latency and how much on downloads and uploads. If set on the client, the upload setting tells the client to simulate delays that mimic the transfer rate for data from the client to the server, and the download setting tells the client to simulate delays that mimic the transfer rate for data from the server to the client. Note that some tools measure upload and download bandwidth in kilobits per second. So a DSL connection with 6 megabits/second download and 512 kilobits/second upload is getting 768 kilobytes/second download and 64 kilobytes/second upload.
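
The conversion from a rate in kilobits per second to the kilobytes per second expected by these properties is a division by eight. A small sketch, with a hypothetical helper name and assuming 1 megabit = 1024 kilobits:

```shell
# Sketch: convert a rate in kilobits/second to kilobytes/second
# (integer division by 8). The helper name kbit_to_kbyte is hypothetical.
kbit_to_kbyte() {
    echo $(( $1 / 8 ))
}

# Example: a 6 megabit/second download and a 512 kilobit/second upload.
kbit_to_kbyte $((6 * 1024))   # prints 768
kbit_to_kbyte 512             # prints 64
```

These results correspond to -Dserver.download.kilobytes.second=768 and -Dserver.upload.kilobytes.second=64.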


Resources

You may take a look at Collaborative Protege that provides support for collaboration (e.g., change tracking, annotations, discussion threads on ontology entities, chat, etc.). Collaborative Protege works in client-server mode with different ontology languages (OWL, RDF, Frames) and for different Protege backends (files or databases).