STARTS
Stanford Protocol Proposal for Internet Search and Retrieval

Reference Implementation


Q:  What's new in Release 2.0?

A:  STARTS now uses CORBA as a transport layer

Introduction

IDL creation

Server changes

Client changes


Introduction:

To use STARTS in combination with other CORBA-based digital library services, we revised it to use CORBA as a transport layer rather than HTTP.  Because the STARTS protocol was written to use SOIF over HTTP, the IDL now becomes part of the STARTS protocol in place of the SOIF objects.  This is not a big conceptual leap: with few exceptions, the IDL corresponds directly to the published STARTS protocol (HTML version / PostScript version).

The CORBA STARTS implementation is based on the Release 1.1 STARTS Reference Implementation, in which attribute sets are first-class objects that can be related to STARTS sources.

IDL creation

The IDL very closely matches the STARTS protocol (HTML version / PostScript version).  It begins with five exceptions that can be thrown during method calls, and one constant declaration.  Next come a number of data types, which correspond almost entirely to the SOIF objects and SOIF attributes given in the STARTS protocol.  The one deviation concerns content summary term statistics, and is explained in the method descriptions below.  Lastly, the IDL has six method specifications.

IDL methods:

  1. createQuery 

    This is a shortcut method to facilitate the creation of query objects.  It is possible for a client to build an sQuery object from scratch, but this shortcut method allows the client to ensure that most values will have meaning at the server.  This is also an easy way for the client to get the server defaults for query objects (for use in a query input form, for example). 

  2. submitQuery 

    This method takes an sQuery data type (a query object) as an argument and returns search results in the form of an sQResultSeq object.

  3. getMetaAttributes

    Given a source, this method returns the source metadata in the form of an sMetaAttributes data type.

  4. getContentSummaryInfo 

    Given a source, this method returns all of the content summary information except the content summary term statistics.  Because CORBA has difficulty transporting large objects, the actual term statistics are handled by a separate method.  The sContentSummaryInfo object indicates whether the term statistics are qualified by field, and which fields are reported at the source.  This information is used when retrieving content summary term statistics (see the getCSTermStatData method).

  5. getCSTermStatData

    CORBA has difficulty transporting large objects, so content summary term statistics are passed separately from the other content summary information.   Since there can be a large number of term statistics for a given field, the server will need to limit the number of term statistics it will pass per field.  This method is designed to be used repeatedly, with the "previousLastTerm" designated in sCSTermStatDatReqElement objects indicating the last term for which we already have term statistics from the server.  If we want term statistics for the first term, "previousLastTerm" is set to the IDL constant "noPreviousLastCSTerm".  Data returned by the server has a designation "containsAllTermStats" in the sCSTermStatDataElement objects to indicate whether there are more term statistics for this field.

  6. getResourceMetadata

    This method returns the STARTS server's resource metadata.
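
The repeated-call pattern described for getCSTermStatData can be sketched as a simple client-side loop.  The sketch below is self-contained and does not use the generated IDL stubs: TermStatPage, TermStatSource, fetchAllTerms and fakeServer are hypothetical stand-ins, the value of NO_PREVIOUS_LAST_CS_TERM is an assumed placeholder for the IDL constant "noPreviousLastCSTerm", and "containsAllTermStats" is assumed to be true once no further term statistics remain for the field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TermStatPaging {
    // Assumed placeholder; the real value of noPreviousLastCSTerm is defined in the IDL.
    static final String NO_PREVIOUS_LAST_CS_TERM = "";

    /** Hypothetical stand-in for the per-field data returned by the server. */
    static final class TermStatPage {
        final List<String> terms;
        final boolean containsAllTermStats; // assumed: true when nothing is left to fetch

        TermStatPage(List<String> terms, boolean containsAllTermStats) {
            this.terms = terms;
            this.containsAllTermStats = containsAllTermStats;
        }
    }

    /** Hypothetical stand-in for the remote getCSTermStatData call, for a single field. */
    interface TermStatSource {
        TermStatPage getCSTermStatData(String field, String previousLastTerm);
    }

    /** Calls the source repeatedly, passing the last term received, until it reports completion. */
    static List<String> fetchAllTerms(TermStatSource source, String field) {
        List<String> all = new ArrayList<>();
        String previousLastTerm = NO_PREVIOUS_LAST_CS_TERM;
        while (true) {
            TermStatPage page = source.getCSTermStatData(field, previousLastTerm);
            all.addAll(page.terms);
            // Stop when the server says it is done; also stop on an empty page to avoid looping forever.
            if (page.containsAllTermStats || page.terms.isEmpty()) {
                return all;
            }
            previousLastTerm = page.terms.get(page.terms.size() - 1);
        }
    }

    /** In-memory fake server that returns at most `limit` terms per call from a sorted list. */
    static TermStatSource fakeServer(List<String> sortedTerms, int limit) {
        return (field, previousLastTerm) -> {
            int start = previousLastTerm.equals(NO_PREVIOUS_LAST_CS_TERM)
                    ? 0
                    : sortedTerms.indexOf(previousLastTerm) + 1;
            int end = Math.min(start + limit, sortedTerms.size());
            List<String> slice = new ArrayList<>(sortedTerms.subList(start, end));
            return new TermStatPage(slice, end == sortedTerms.size());
        };
    }

    public static void main(String[] args) {
        TermStatSource server =
                fakeServer(Arrays.asList("corba", "idl", "query", "soif", "starts"), 2);
        // Three calls are made (2 + 2 + 1 terms); prints [corba, idl, query, soif, starts]
        System.out.println(fetchAllTerms(server, "title"));
    }
}
```

Because the client only remembers the last term it received, the server can choose its own per-field limit without the client needing to know it.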

Server changes

Our ORB (OrbixWeb 3.0) choked on null values in passed CORBA objects, so our code reflects this quirk.  Aside from getting the server to speak OrbixWeb 3.0 flavor CORBA, the following changes were made:

Since the changes for Release 2.0 were broad, we recommend that you reinstall all the code.  For details on the organization of the code, see the Implementation Notes.
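
One way to work around an ORB that rejects null values is to normalize fields before they are passed in a CORBA object.  The following is a minimal sketch of that idea, not the code from the reference implementation; the NullSafe class and its orEmpty helpers are hypothetical names, and substituting empty strings and empty arrays for nulls is an assumption about what the receiving side will accept.

```java
class NullSafe {
    /** Returns the value unchanged, or an empty string when it is null. */
    static String orEmpty(String value) {
        return value == null ? "" : value;
    }

    /** Returns a copy with null elements replaced by empty strings; a null array becomes an empty one. */
    static String[] orEmpty(String[] values) {
        if (values == null) {
            return new String[0];
        }
        String[] out = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = orEmpty(values[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] fields = orEmpty(new String[] {"title", null});
        // prints "2 true"
        System.out.println(fields.length + " " + fields[1].isEmpty());
    }
}
```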

Client changes

The use of CORBA allowed for a GUI client, rather than a CGI script.  Our client uses Java 1.1 event handling, so it must be run in a context that handles 1.1 events.  At press time, none of the web browsers do this, so the client was written so that it can be run as an applet or as an application.  We checked the applet in our development environment, but mostly we used it as an application.  Our client speaks OrbixWeb 3.0 flavor CORBA.

The client is not fancy -- it demonstrates the full functionality of the CORBA STARTS Reference Implementation, but we did not invest a lot of time in design or layout.  There are four classes:


Send questions to help@ncstrl.org