Dienst Software 2000-04-24 10:09:23 -0400
This document describes the Dienst software, which can be installed at multiple sites to create a distributed digital library. This software is based on two other entities that share the use of the word "Dienst": the Dienst architecture and the Dienst protocol.
This document assumes that the reader understands the architecture document and has read the introductory sections (everything but the verbs) of the protocol document.
This document gives a relatively brief overview of the software implementation and distribution. This is intended as an introduction to the full installation document.
The Dienst software consists of a set of service modules that correspond to the services defined in the Dienst protocol. Running the Dienst server entails supporting one or more of these service modules on a server that runs in conjunction with an HTTP server. A Dienst digital library consists of a set of these services (either running on a single server/host or distributed servers/hosts) that interoperate through the Dienst protocol.
The operational semantics of each of the services are described in the protocol document. A brief description of the implementation of each service in the software distribution is as follows:
The Dienst software is written in Perl. In addition to the basic Perl distribution, Dienst requires a number of additional Perl modules, all of which are available from CPAN. The exact modules required are listed in the installation document.
A number of other software packages are required depending on the services installed:
Installation of the index service requires a search engine able to process fielded queries and return ranked results. The Dienst implementation is configured to use freeWAIS-sf.
The Dienst software is designed and written to be run in conjunction with an HTTP server. (In fact, the protocol is designed to be embedded in URLs carried in HTTP requests.) For the remainder of this document, a distinction is made between two entities:
The use of the phrase Dienst Service refers to a software module that provides the functionality of the set of verbs defined for that service in the Dienst protocol.
The use of the phrase Dienst Server refers to a software module associated with an HTTP server (through mechanisms defined below) that provides access to one or more Dienst Services.
While it would be possible to link Dienst to an HTTP server as a CGI process, this would be extremely inefficient. First, as in all Perl CGI implementations, each Dienst request would incur the overhead of starting the Perl interpreter and of loading and compiling the Dienst code. Second, there are initialization steps that need to be performed each time a Dienst server starts up, and these would be onerous if repeated for each Dienst protocol request.
As an alternative to CGI, the Dienst software is designed to run as an Apache module using mod_perl. This method embeds a persistent Perl interpreter and Dienst server in the Apache server, thus avoiding the overhead of starting an external interpreter and the penalty of Perl start-up time. The Dienst software is essentially part of an Apache thread for the life of the thread. The only overhead encountered is in the first Dienst protocol request to an Apache thread and, of course, the additional memory payload of each Apache thread.
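The difference between the two approaches can be illustrated with a minimal sketch. It is written in Python rather than Perl purely for illustration, and every name in it (DienstLikeServer, serve_cgi_style, serve_persistent) is hypothetical and not part of the Dienst distribution. It only counts how often start-up work runs under each model.

```python
class DienstLikeServer:
    """Hypothetical stand-in for a server with costly start-up work."""

    init_count = 0  # how many times the start-up work has run

    def __init__(self):
        # Stands in for loading code and running initialization steps.
        DienstLikeServer.init_count += 1

    def handle(self, request):
        return f"handled {request}"


def serve_cgi_style(requests):
    """Create a fresh server for every request, as a CGI process would."""
    return [DienstLikeServer().handle(r) for r in requests]


def serve_persistent(requests):
    """Create the server once and reuse it, as an embedded interpreter would."""
    server = DienstLikeServer()
    return [server.handle(r) for r in requests]


def count_initializations(serve, requests):
    """Run one serving strategy and report how often start-up work ran."""
    DienstLikeServer.init_count = 0
    serve(requests)
    return DienstLikeServer.init_count
```

With three requests, the CGI-style function performs the start-up work three times, while the persistent function performs it once; both return the same responses.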
Dienst can run in conjunction with an existing Apache Web server, which provides standard HTTP functionality to your organization. The only modifications that need to be made to an Apache Web server to run Dienst (other than the building of the server with mod_perl) are the settings in the Apache configuration file to route Dienst requests to the proper script. These configuration file changes are described in the installation document.
All of the Perl code in Dienst will run on any computer system that supports Perl (many flavors of UNIX, many flavors of Windows, MacOS). Unfortunately, the other software on which some Dienst services rely (freeWAIS-sf, PerlMagick, Apache, and mod_perl) is much more UNIX-dependent (some of these software packages have Windows versions, but we have found them to be unstable or difficult to install). As a result, the entire package of Dienst software (all services) is supported only on UNIX (including Linux, Solaris, AIX, Ultrix, and HP-UX). It may also run on other flavors of UNIX - the main obstacle is porting the non-Perl software.
In the future, we hope to support at least parts of the Dienst software on Windows NT or Windows 2000. For example, the repository service has no external package dependencies - except for mod_perl. We are examining using PerlEx from ActiveState, a Perl/Web server integration using Microsoft IIS. (Apache for Windows is not a stable product.) Information will be posted at this site if and when that port is done.
Any system that is capable of hosting a Web server should be capable of running Dienst. That includes a standard desktop workstation (e.g., Sun, IBM, etc.) or a garden-variety Pentium-class PC running Linux (e.g., 400 MHz processor, 128 MB of memory, Ethernet connection, multi-gigabyte hard disk). The capacity of the system, in terms of both processor and memory, determines its ability to handle a very high volume of HTTP and Dienst requests. Disk space consumption by the actual Dienst software is minimal - in the range of several megabytes. Actual consumption of disk space depends on the amount of content in the Dienst repository that will be hosted.
The physical organization - the directory layout - of the Dienst code corresponds to the service-orientation of the Dienst architecture. The code is organized into four groups:
Main source files - These are the source files that are associated with the main entry point of Dienst. This code manages receiving the hand-off of Dienst protocol requests from the HTTP server (through mod_perl), parsing of the requests, and dispatching them to the appropriate Dienst service.
Common source files - These are source files, other than main source files, that are not service-specific. Examples of functionality in these source files are code for assembling requests to other services and servers, client code for the collection service, and output formatting.
Configuration source files - These are source files that contain information for customizing the generic (non-service-related) components of the Dienst software. All configuration of Dienst at installation time is restricted to these files and the service-specific <service>_init.pl files that are described below.
Service source files - These are the source files that provide the functionality for the individual Dienst services (e.g., repository, index, collection). Each service is organized into its own source directory and its own Perl package. There are no code dependencies (i.e., subroutine calls) among services. Each service has code dependencies only on the Common source files. All communication among services takes place through protocol requests, regardless of whether the communicating services are running within the same Web server (in the future we may optimize for this case). Each service directory has a distinguished file, called <service>_init.pl, that contains the configuration information for that service.
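Because services communicate only through protocol requests, the code that assembles and parses those requests is the one interface they share. The following is a minimal sketch of that idea in Python, used here for illustration only. The path layout it assumes (a "Dienst" prefix followed by service, version, verb, and arguments) and all names in it (DienstRequest, build_path, parse_path, ExampleVerb) are assumptions made for this example; the real request syntax is defined in the protocol document.

```python
from typing import NamedTuple
from urllib.parse import quote, unquote

PREFIX = "Dienst"  # assumed path prefix, not taken from the protocol document


class DienstRequest(NamedTuple):
    """Hypothetical in-memory form of a protocol request."""
    service: str
    version: str
    verb: str
    args: tuple


def build_path(req):
    """Assemble a URL path for a request to another service."""
    parts = [PREFIX, req.service, req.version, req.verb, *req.args]
    # Percent-encode every component so that "/" inside an argument survives.
    return "/" + "/".join(quote(p, safe="") for p in parts)


def parse_path(path):
    """Parse a URL path back into a request, as a main entry point might."""
    parts = [unquote(p) for p in path.strip("/").split("/")]
    if len(parts) < 4 or parts[0] != PREFIX:
        raise ValueError(f"not a Dienst-style path: {path!r}")
    _, service, version, verb, *args = parts
    return DienstRequest(service, version, verb, tuple(args))
```

In this sketch a caller never invokes another service's subroutines directly. It builds a request with build_path, and the receiving server recovers the same request with parse_path, whether or not the two services run in the same Web server.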
Complete documentation for installing Dienst is available in the installation instructions. The major decision to be made before installing Dienst is the specific services that are to be installed. Since each of the services is a stand-alone module, it is possible to configure a server with any set of the available services (e.g., Repository and Index). It is expected that most Dienst installations will be made by organizations wishing to join an existing distributed digital library. The administrator of the digital library can offer guidance on service selection.
This section provides a very brief overview of the steps for installing Dienst, which are as follows:
If necessary, download Apache and mod_perl, build them for your computer system, and install them.
If necessary, download the current Perl release, build it for your computer system, and install it.
Download, build, and install the CPAN Perl modules that are required to run Dienst (specified in the installation instructions).
Download the latest version of the Dienst software.
Install the Dienst software. The software release includes an installation script which automates the modification of the Apache configuration file and the customization of the configuration files in Dienst. The installation process of the Dienst software has two separate components:
Configuration of the generic server.
Configuration of services. At installation time you decide which services you wish to install and configure only those services. Verbs associated with services that you don't install will not be available to clients.
Test the installed server with the test scripts.
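The effect of choosing services at installation time can be sketched as follows. This is an illustration in Python, not the Dienst implementation: the ServerSketch class, the handler functions, and the verb names are all hypothetical. It shows only the behavior stated above, namely that verbs belonging to services that were not installed are unavailable to clients.

```python
class ServerSketch:
    """Hypothetical server that exposes only the services it was given."""

    def __init__(self, installed):
        # installed maps a service name to a {verb name: handler} table.
        self._services = dict(installed)

    def dispatch(self, service, verb, *args):
        """Route a request to the handler for the named service and verb."""
        verbs = self._services.get(service)
        if verbs is None:
            raise LookupError(f"service not installed: {service}")
        handler = verbs.get(verb)
        if handler is None:
            raise LookupError(f"unknown verb for {service}: {verb}")
        return handler(*args)


def example_repository_verb(doc_id):
    """Hypothetical handler standing in for a repository verb."""
    return f"repository result for {doc_id}"


# Only the repository service is selected at "installation time".
server = ServerSketch({"Repository": {"ExampleVerb": example_repository_verb}})
```

A request for the installed service is handled normally, while a request naming a service that was left out raises LookupError in this sketch.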
It is expected that most Dienst installations will be made by organizations wishing to join an existing distributed digital library. The information in this section does not apply to that type of installation.
Others may be interested in configuring one or more Dienst servers to create a new digital library. This involves a number of tasks including deciding the services offered by distributed servers and the configuration of collection servers and regions (described in the Dienst Architecture Summary). This information is provided in the Dienst Administration Document.