Open Archives Software 2000-03-07 13:54:51 -0500
Open Archives Reference Software
This document describes the installation and use of the Open Archives Software (OA Software) subset of the Dienst software. This software provides a front end, simple to install and use, for archives that choose to support the Open Archives Subset of the Dienst Protocol. This protocol provides a mechanism for harvesting common metadata - the Open Archives Metadata Set - and archive-specific metadata from records (e.g., documents) in participating archives. The Open Archives Initiative home page provides complete information on participation in the initiative.
The OA Software is a small set of Perl files that manage dispatching of protocol requests defined by the Open Archives Subset. The OA Software is intended for use in conjunction with site-specific software that manages the individual archive. Use of the OA Software will require programming to establish the actual functional interface between the dispatched protocol requests and the individual archive.
Organizations wishing to participate in the Open Archives Initiative that do not already have archive software should look at the full Dienst software release.
The Open Archives Software is designed and written to be run in conjunction with an HTTP server. (In fact, the protocol is designed to be embedded in URLs carried in HTTP requests.) The installation instructions support two mechanisms for linking an HTTP server to the Open Archives Software:
Using standard CGI, which is supported by virtually all HTTP servers (although the OA Software is intended to be run with the Apache HTTP server).
Using mod_perl, which embeds a persistent Perl interpreter and the Open Archives Software in an Apache HTTP server. This significantly speeds up the handling of protocol requests by avoiding the overhead of starting the Perl interpreter at each request. mod_perl is only supported for various flavors of UNIX (e.g., Linux, Solaris, HP-UX).
All of the Perl code in the Open Archives Software will run on any computer system that supports Perl (many flavors of UNIX, many flavors of Windows, MacOS). However, use of the OA Software in conjunction with an HTTP server requires URL rewriting (in order to redirect Open Archives protocol requests to the OA Software). To the best of our knowledge, URL rewriting is available only through the mod_rewrite module in Apache. While Apache is supported on flavors of both UNIX and WIN32, the following caveat for WIN32 exists (lifted from the Apache Windows Web Page):
Warning: Apache on NT has not yet been optimized for performance. Apache still performs best, and is most reliable on Unix platforms. Over time we will improve NT performance. Folks doing comparative reviews of webserver performance are asked to compare against Apache on a Unix platform such as Solaris, FreeBSD, or Linux.
Furthermore, installers of the software who wish to exploit the performance gains offered by mod_perl can only do so on UNIX systems.
Any system that is capable of hosting a Web server should be capable of running the Open Archives Software. That includes a standard desktop workstation (e.g., Sun, IBM, etc.) or a garden variety Pentium-class PC running Linux (e.g., 400 MHz processor, 128 MB of memory, Ethernet connection, multi-gigabyte hard disk). The capacity of the system, in terms of both processor and memory, will determine its ability to handle a very high volume of HTTP and Open Archives protocol requests. Disk space consumption by the actual Dienst software is minimal - in the several megabyte range. We expect that actual hardware requirements for running an archive site will depend on the archive-specific software rather than the Open Archives Software itself.
The following software is required for installation and execution of the Open Archives Software:
Perl - minimum version 5.005_xx. Available from http://www.perl.com.
Perl Modules from CPAN (may already be installed in your Perl configuration).
Apache - minimum version 1.3.x. Available from http://www.apache.org.
mod_perl - minimum version 1.2.x. Available from http://perl.apache.org.
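Before installing, it can help to confirm that the prerequisites are present. The following is a minimal shell sketch under stated assumptions, not part of the OA Software distribution: the function names have_cmd and perl_at_least are illustrative, and a POSIX shell is assumed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the OA Software) for checking prerequisites.

# have_cmd NAME: succeed if NAME is found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# perl_at_least VERSION: succeed if the installed Perl is at least VERSION,
# written in Perl's numeric form (5.005 for the 5.005_xx series).
perl_at_least() {
    have_cmd perl && perl -e 'exit($] >= $ARGV[0] ? 0 : 1)' "$1"
}
```

For example, perl_at_least 5.005 || echo "Perl 5.005 or later is required" reports a Perl that is too old or missing. The Apache version can be read from the server binary itself with httpd -v.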
The physical organization of the OA Software is as follows. The code is organized into five directories:
Main - These are the source files that are associated with the main entry point of the software. This code manages receiving the hand-off of Open Archives protocol requests from the HTTP server (through mod_perl or CGI), parsing of the requests, and dispatching them.
Common - These are source files that contain common utilities for the software.
Config - These are source files that contain information for localizing the software. Some of the variable settings in this directory need to be changed at installation time.
Services/Repository - The source files containing protocol definitions and the stub functions that provide the interface to local archive functionality.
Services/Info - The source files containing protocol definitions and the stub functions that provide the basic service information.
Instructions on changes to these files at installation time are provided in the installation section.
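The layout described above can be checked after unpacking. This is a minimal sketch, assuming a POSIX shell and assuming the repository directory is spelled Services/Repository, as in the path to Repository_stubs.pl given later in this document; check_layout is an illustrative name, not something shipped with the software.

```shell
#!/bin/sh
# check_layout OA_SRC: report any of the five documented directories that are
# missing below OA_SRC. Succeeds only if all of them are present.
# (Hypothetical helper; not part of the OA Software.)
check_layout() {
    root=$1
    status=0
    for dir in Main Common Config Services/Repository Services/Info; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing: $root/$dir"
            status=1
        fi
    done
    return $status
}
```

Calling check_layout with the directory into which the source was unpacked prints one line per missing directory and prints nothing when the tree is complete.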
Follow these steps to install the Open Archives Software. Note that the installation process assumes that your site has not installed, and is not already running, an Apache HTTP server. If you are already running Apache, you will need to modify the configuration file for that server as described below:
Download the latest version of the Apache source into a temporary directory.
(Skip this step if you already have Apache installed and running at your site). The Apache source is available from http://www.apache.org. Untar the Apache source file to create the Apache source directory (the full path of that directory will be called apache_src in the following steps).
Download the latest version of the mod_perl source into the same temporary directory.
(Skip this step if you already have an Apache server built with mod_perl, or if you wish to use standard CGI - which will degrade performance of your Open Archives Server). The mod_perl source is available from http://perl.apache.org. Untar the mod_perl source file to create the mod_perl source directory (the full path of that directory will be called mod_perl_src in the following steps).
In the mod_perl_src directory run the following commands:
    perl Makefile.PL \
        APACHE_SRC=apache_src \
        DO_HTTPD=1 \
        USE_APACI=1 \
        PREP_HTTPD=1 \
        EVERYTHING=1
    make
    make install
Note that to run these commands you will need write access to your Perl installation (in most cases this means that you have root access to your machine). Note that detailed information on installing mod_perl is available in the installation files in the mod_perl source directory.
Choose the directory into which you wish to install Apache (this will be called apache_run for the remainder of this document). In the apache_src directory run the following commands:
    ./configure \
        --prefix=apache_run \
        --activate-module=src/modules/perl/libperl.a \
        --enable-module=rewrite \
        --enable-shared=rewrite
    make
    make install
Note that detailed information on installing Apache is available in the installation files in the Apache source directory.
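Go to apache_run/conf and edit the httpd.conf file. Find the line that sets the server port (in a stock Apache 1.3 httpd.conf this is the Port directive), which says:
    Port xxxx
where xxxx is a number like 8090, and either leave it or change it to the port on which you want to run your Apache HTTP server. Now go to apache_run/conf and run the command that starts the server; in a standard Apache installation this is:
    ../bin/apachectl start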
Your Apache server should start. If it doesn't, refer to the Apache documentation for help.
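The port change can also be made from the command line. The following is a minimal sketch, assuming the port is set by a Port directive at the start of a line, as in a stock Apache 1.3 httpd.conf; set_port is an illustrative name and is not part of Apache or the OA Software.

```shell
#!/bin/sh
# set_port CONF PORT: rewrite the "Port nnnn" line in CONF to use PORT.
# All other lines are copied unchanged. (Hypothetical helper.)
set_port() {
    conf=$1
    port=$2
    tmp="$conf.new.$$"
    # Write the edited copy to a temporary file, then copy it back over the
    # original so that the original file keeps its permissions.
    sed "s/^Port[[:space:]][[:space:]]*[0-9][0-9]*/Port $port/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf" &&
        rm -f "$tmp"
}
```

For example, set_port apache_run/conf/httpd.conf 8090 would select port 8090. It is worth keeping a backup of httpd.conf before editing it this way.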
Download the Open Archives Software.
The OA Software is available here. Untar the source into a directory that is readable by the Apache Web server installed above. This directory will be called OA_src for the remainder of this document. The directory tree below OA_src should look like that described above.
Configure the Open Archives Software.
You must modify two files in order to configure the Open Archives Software:
Edit the file at OA_src/Main/dienst.pl and locate the setting of the variable $dienst::source_dir. Modify the path setting of this variable as indicated in the comment that accompanies it.
Edit the file at OA_src/Config/config_constants.pl and follow the instructions in the comments that show which variables should be modified. Set those variables appropriately.
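To find the settings to edit, the files can be searched for the variable names. This is a small sketch assuming a POSIX shell with grep; show_setting is an illustrative name only.

```shell
#!/bin/sh
# show_setting FILE NAME: print, with line numbers, every line of FILE that
# mentions NAME. NAME is matched as a fixed string, not as a pattern, so the
# "$" and "::" in Perl variable names need no escaping. (Hypothetical helper.)
show_setting() {
    grep -n -F -- "$2" "$1"
}
```

For example, show_setting OA_src/Main/dienst.pl '$dienst::source_dir' lists the lines where that variable appears. The single quotes stop the shell from expanding the dollar sign.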
Configure the Apache Server to use the Open Archives Software.
Go to apache_run/conf and edit the httpd.conf file. Add the following lines at the end of the file.
RewriteRule ^/Dienst(.*) OA_src/Main/dienst.pl
allow from all
Note that the line allow from all specifies that all clients can execute the Open Archives protocol requests. If you want more constrained access consult the Apache documentation.
Now go to apache_run/conf and run the restart command (in a standard Apache installation, ../bin/apachectl restart) to restart your Apache Server with the configuration changes.
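A quick way to confirm that the rewrite rule was added is to read it back from the configuration file. This is a minimal sketch, assuming the rule is written on one line with whitespace-separated fields as shown above; dienst_rewrite_target is an illustrative name.

```shell
#!/bin/sh
# dienst_rewrite_target CONF: print the substitution (third field) of each
# RewriteRule in CONF whose pattern begins with ^/Dienst.
# (Hypothetical helper; not part of Apache or the OA Software.)
dienst_rewrite_target() {
    awk '$1 == "RewriteRule" && index($2, "^/Dienst") == 1 { print $3 }' "$1"
}
```

The printed path should be the dienst.pl file under your OA_src directory. If nothing is printed, the rule is missing or is written differently from the form assumed here.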
Perform Basic Installation Tests
Edit the file OA_src/Tests/InstallTest.htm and change all occurrences of the string host:port to the actual host and port of your Apache Server. Open this file in a Web browser and check that each link successfully returns an XML document with a root element that has the same name as the verb of the respective request. The root element should have a single attribute named version, whose value is the version of the verb of the respective request. For example, a Disseminate verb with version 1.0 will produce text/xml content wrapped in a tag like <Disseminate version="1.0">.
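The root-element check can also be scripted against a saved response. The following is a minimal sketch, assuming a POSIX shell, a response whose root start tag carries only the version attribute in double quotes, and no comments before the root element; check_root is an illustrative name and not part of the OA Software.

```shell
#!/bin/sh
# check_root VERB VERSION: read an XML response on standard input and succeed
# only if its root element is <VERB version="VERSION">.
# (Hypothetical helper.)
check_root() {
    verb=$1
    version=$2
    # Join the lines, drop the XML declaration if present, discard anything
    # before the first "<", and keep everything up to the first ">".
    root=$(tr '\n' ' ' |
        sed -e 's/<?xml[^>]*?>//' -e 's/^[^<]*//' -e 's/>.*$/>/')
    [ "$root" = "<$verb version=\"$version\">" ]
}
```

For example, check_root Disseminate 1.0 < response.xml succeeds when response.xml (a hypothetical file holding the body returned by one of the test links) begins with the expected root element. This only inspects the first tag; it does not validate the rest of the document.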
You now need to do the actual programming task of linking the Open Archives Software to the archive software that you are running at your site. All Open Archives Protocol requests are dispatched to the set of subroutines in the file OA_src/Services/Repository/Repository_stubs.pl. This file is commented to show the points at which each protocol request is handled and at which point you should insert the linkages to your own archive code. When customizing the code for your individual site you should refer to: