STARTS Reference Implementation - Release 1.1

Extended Attribute Set Support

Introduction

Description of Changes

Issues Not Addressed

Demonstration

Direct questions to help@ncstrl.org


Introduction:

This version of the STARTS reference implementation grew out of the desire to support the Dublin Core attribute set in addition to the basic-1 attribute set that was developed with the STARTS protocol. While working out the changes required to introduce the Dublin Core attribute set into the existing STARTS reference implementation, we began to consider an attribute set as an abstraction in itself. Influenced by the proposed container architecture described in the Warwick Framework, we now regard an attribute set as a first-class object that has a relationship with a STARTS source but is not part of the source itself. Consistent with this perspective, we created STARTS 1.1, which is a first step in making the reference implementation compliant with the evolving notion of metadata packages in a container architecture. This version of the reference implementation achieves the following specific goals:

  1. Attribute sets are first-class objects that can be linked to one or more STARTS sources
  2. STARTS sources can support one or more attribute sets
  3. STARTS queries can handle explicit attribute set qualifiers on fields
  4. STARTS results can report fields from one or more attribute sets

 

 

Description of Changes:

The STARTS 1.0 protocol permits the use of attribute sets other than basic-1. The reference implementation provided for this via variables and methods in the source classes, most notably through static arrays that define attribute-set/field pairs. This embedding of attribute-set-related data and methods in STARTS source classes did not recognize the attribute set as a significant entity. Also, while the source classes allowed for the definition of fields from different attribute sets, other parts of the system could not support the use of alternate attribute sets in STARTS queries. For instance, the parser grammar (see classes YYparse and YYlex) was unable to accommodate queries that specified an attribute set other than basic-1. In particular, the lexer defined a token called "BASIC1-FIELD" and scanned for each of the basic-1 fields.

To rectify these problems and achieve the goals stated in the introduction, the following changes to STARTS were made:

1. Make the parser more generic, and not tied to specific attribute-set fields.

The lexical analyzer and parser grammar were changed to identify an attribute set and a field, but not particular values for these two entities. A generic parser is preferable: one that can detect syntactic patterns without pre-defining too much. Again, attribute-set definitions belong in attribute-set objects, not in the parsing machinery.

Problems:

Changes:

 

- field_spec

- [attribute_set field_spec]
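The two forms listed above (a bare field, or an attribute set and a field in square brackets) can be recognized by shape alone, without naming any particular field. The following is a minimal sketch of that idea and not the grammar used by YYparse and YYlex. The class FieldSpec, its parse method, the character set allowed in names, and the fallback to basic-1 for an unqualified field are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for an (attribute set, field) pair recognized by shape only. */
final class FieldSpec {
    // Assumption: an unqualified field falls back to basic-1.
    static final String DEFAULT_ATTR_SET = "basic-1";

    // "[attribute_set field]": two names inside brackets; no field names are hard-coded.
    private static final Pattern QUALIFIED =
        Pattern.compile("\\[\\s*([A-Za-z0-9_-]+)\\s+([A-Za-z0-9_-]+)\\s*\\]");
    // Bare "field": a single name.
    private static final Pattern BARE = Pattern.compile("[A-Za-z0-9_-]+");

    final String attrSet;
    final String field;

    FieldSpec(String attrSet, String field) {
        this.attrSet = attrSet;
        this.field = field;
    }

    /** Parses either form; whether the names are valid is left to attribute-set objects. */
    static FieldSpec parse(String text) {
        String trimmed = text.trim();
        Matcher m = QUALIFIED.matcher(trimmed);
        if (m.matches()) {
            return new FieldSpec(m.group(1), m.group(2));
        }
        if (BARE.matcher(trimmed).matches()) {
            return new FieldSpec(DEFAULT_ATTR_SET, trimmed);
        }
        throw new IllegalArgumentException("Not a field specification: " + text);
    }
}
```

With this sketch, FieldSpec.parse("[dcore-1 IDENTIFIER]") yields the pair (dcore-1, IDENTIFIER), and FieldSpec.parse("author") yields (basic-1, author). Deciding whether the attribute set exists or the field belongs to it is deliberately left out of the parser.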

 

2. Make attribute sets first-class objects with their own variables and methods.

An attribute set is now a first-class object. Each attribute set is defined as a sub-class of the abstract class named AttrSet. With this new abstraction, objects can be created that carry information about each attribute set in general (static data), as well as attribute-set information in the context of a specific source. Using attribute-set objects, we can discover which fields are part of the attribute set and whether these fields are supported in the context of a particular source. We can also obtain WAIS translations of attribute-set fields in the context of a particular source. The previous STARTS implementation embedded attribute-set information and behaviors inside source objects, which limited the flexibility and extensibility of both existing and future attribute sets. The new design should allow the system to better handle the integration of additional attribute sets. It also positions STARTS to evolve with metadata developments such as the Warwick Framework's container architecture, which supports multiple metadata "packages" for a digital object.
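To make the split between static attribute-set data and per-source data concrete, here is a minimal sketch of how such a class hierarchy might look. It is not the actual AttrSet code. The class names carry a "Sketch" suffix to mark them as hypothetical, the method names and signatures are assumptions, the field lists are illustrative subsets, and the translation value "ti" is assumed to be the WAIS name for the title field.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the real AttrSet class may differ in names and signatures.
abstract class AttrSetSketch {
    // Per-source data: attribute-set field -> field name understood by the source's engine.
    private final Map<String, String> translations = new HashMap<>();

    /** Name used in queries, for example "basic-1". */
    abstract String name();

    /** Static data: every field the attribute set defines. */
    abstract Set<String> fields();

    /** Records that the linked source supports a field, and how to translate it. */
    void support(String field, String engineField) {
        if (!fieldInAttrSet(field)) {
            throw new IllegalArgumentException(field + " is not in " + name());
        }
        translations.put(field, engineField);
    }

    boolean fieldInAttrSet(String field) {
        return fields().contains(field);
    }

    boolean fieldSupportedBySource(String field) {
        return translations.containsKey(field);
    }

    /** Returns the engine-level field name, or null if the source does not support it. */
    String translateField(String field) {
        return translations.get(field);
    }
}

final class Basic1Sketch extends AttrSetSketch {
    // Illustrative subset only, not the full basic-1 field list.
    private static final Set<String> FIELDS =
        new LinkedHashSet<>(Arrays.asList("title", "author"));

    String name() { return "basic-1"; }
    Set<String> fields() { return FIELDS; }
}

final class Dcore1Sketch extends AttrSetSketch {
    // Illustrative subset only, not the full Dublin Core element list.
    private static final Set<String> FIELDS =
        new LinkedHashSet<>(Arrays.asList("IDENTIFIER", "TITLE"));

    String name() { return "dcore-1"; }
    Set<String> fields() { return FIELDS; }
}
```

In this sketch a source would create one object per attribute set it is linked to and call support("title", "ti") for each field it can search. A field such as author then remains part of basic-1 (fieldInAttrSet returns true) while being reported as unsupported by that particular source.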

 Changes to Java Classes:

 

Created classes AttrSet (abstract), with sub-classes AttrSetBasic1 and AttrSetDcore1:

 

Class CSTRSourceDescription and Class LINUXSourceDescription:

 

Class WAISSourceDescription:

String translatedField = query.source.GetAttrSet(attrSet).TranslateField(field);

Class SourceDescription

 

Class Field

public boolean Supported_p() {
    return query.source.GetAttrSet(GetAttributeSet()).FieldSupportBySource(GetFieldName());
}

- Is the attribute set supported by the source? (AttrSetSupported_p)

- Is the field part of the specified attribute set? (FieldinAttrSet)

- Is the attribute-set field supported by the source in the context of the query? (Supported_p)
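The three questions above form a natural sequence, since each one only makes sense once the previous one has been answered "yes". The following sketch shows one way to apply them in order and report which check failed. It is not the code behind AttrSetSupported_p, FieldinAttrSet, or Supported_p; the class FieldCheckSketch, its Result enum, and the nested-map representation of a source are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a source and its linked attribute sets.
final class FieldCheckSketch {
    enum Result { ATTR_SET_NOT_SUPPORTED, FIELD_NOT_IN_ATTR_SET, FIELD_NOT_SUPPORTED, OK }

    // attribute set name -> (field name -> is the field supported by this source?)
    private final Map<String, Map<String, Boolean>> attrSets = new HashMap<>();

    /** Declares a field of an attribute set and whether this source supports it. */
    void declare(String attrSet, String field, boolean supported) {
        Map<String, Boolean> fields = attrSets.get(attrSet);
        if (fields == null) {
            fields = new HashMap<>();
            attrSets.put(attrSet, fields);
        }
        fields.put(field, supported);
    }

    /** Applies the three checks in order and stops at the first failure. */
    Result check(String attrSet, String field) {
        Map<String, Boolean> fields = attrSets.get(attrSet);
        if (fields == null) {
            return Result.ATTR_SET_NOT_SUPPORTED;   // check 1: attribute set known to the source?
        }
        Boolean supported = fields.get(field);
        if (supported == null) {
            return Result.FIELD_NOT_IN_ATTR_SET;    // check 2: field part of the attribute set?
        }
        return supported ? Result.OK : Result.FIELD_NOT_SUPPORTED;  // check 3
    }
}
```

Returning a distinct result for each failed check lets a caller tell the user whether the attribute set, the field, or only the source's support for the field is the problem.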

 

Class Document

@SQRDocument{

Version{10}: STARTS 1.1

[basic-1 author]{12}: Lagoze, Carl

[basic-1 title]{99}: dkfjkdjfkdjfkjdjfdkjfkdjkfjdkjfdkjfk

[dcore-1 IDENTIFIER]{99}: CORNELLCS:jfkdjfkdjkfjdjfkdjfkdj

In this example we see that the requested "answer fields" are author and title from the basic-1 attribute set, and IDENTIFIER from the dcore-1 (Dublin Core) attribute set. Although mixing fields from different attribute sets in the "answer field" specification may not seem practical in the example above, we can envision cases where it would be. For instance, a source may have been created with MARC records, but with a mapping to another attribute set enabled (such as a mapping to Dublin Core). If the user opts to primarily "speak" Dublin Core to the system, but there are MARC fields that do not map to the Dublin Core attribute set, then the ability to mix and match attribute-set/field combinations, as in the above example, becomes useful. Essentially, this provides the ability to operate with a preferred attribute set across multiple sources, even if the underlying documents or document surrogates were created using another attribute-set template.
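The qualified result lines in the example can be produced by a small formatting helper. The sketch below is an illustration only and is not the code in Class Document. It assumes that the number in braces is the byte length of the value, which agrees with the Version and author lines of the example; the class AnswerLineSketch, its format method, and the choice of UTF-8 for counting bytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for emitting one attribute-set-qualified answer field.
final class AnswerLineSketch {
    /**
     * Builds a line of the form "[attrSet field]{length}: value".
     * Assumption: {length} is the number of bytes in the value.
     */
    static String format(String attrSet, String field, String value) {
        int length = value.getBytes(StandardCharsets.UTF_8).length;
        return "[" + attrSet + " " + field + "]{" + length + "}: " + value;
    }
}
```

For example, AnswerLineSketch.format("basic-1", "author", "Lagoze, Carl") returns the string "[basic-1 author]{12}: Lagoze, Carl". Because the attribute set is written on every line, fields from basic-1 and dcore-1 can appear side by side in a single result.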

Class CSTRDocument and Class LINUXDocument

// Translation table keyed by WAIS field name (used by GetFieldValueFromDB).
static Hashtable transTable = new Hashtable();

static {
    transTable.put("ti", "TITLE");
    transTable.put("au", "AUTHOR");
    transTable.put("dm", "ENTRY");
    transTable.put("id", "ID");
    transTable.put("bd", "BODY");
}

The method GetFieldValueFromDB uses this hashtable. Changes were made to Class WAISResultDocument to ensure that a WAIS field is passed to this method.

Class WAISResultDocument

 

Issues Not Addressed: