The STARTS server (Release 1.1 and up) is designed to support multiple attribute sets. The reference implementation currently supports two attribute sets:
This new class should extend the abstract class AttrSet. The new attribute set class should define the static array "AttrSetFields" that lists all of the fields that constitute the new attribute set. The format for this array is: {"<field>", "<boolean value>"}, as in the following example:
public static String [] [] AttrSetFields = {
{"TITLE", "true"},
{"CREATOR", "true"},
{"SUBJECT", "false"}
};
The boolean values "true" and "false" are required-field indicators. Enter "true" if the field must be recognized by a source and "false" if the field may optionally be recognized by a source. Currently, the STARTS implementation does not interrogate this boolean value, so it exists for reference only. The implementation does load this array into a hashtable and perform a simple key lookup to determine whether a field is part of a particular attribute set (see method check() in class Field ).
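The key lookup described above can be sketched as follows. This is a minimal illustration, not the actual code of check() in class Field: the class name AttrSetLookupSketch, the table name FIELD_TABLE, and the method isMember() are hypothetical.

```java
import java.util.Hashtable;

// Hypothetical stand-in for an attribute-set class; not the STARTS source.
class AttrSetLookupSketch {
    // Same {"<field>", "<boolean value>"} layout as the example array.
    static final String[][] ATTR_SET_FIELDS = {
        {"TITLE", "true"},
        {"CREATOR", "true"},
        {"SUBJECT", "false"}
    };

    // Field name -> required-field indicator, loaded once at class init.
    static final Hashtable<String, Boolean> FIELD_TABLE = new Hashtable<>();

    static {
        for (String[] row : ATTR_SET_FIELDS) {
            FIELD_TABLE.put(row[0], Boolean.valueOf(row[1]));
        }
    }

    // Simple key lookup: is the field part of this attribute set?
    // The required-field indicator is stored but deliberately not consulted.
    static boolean isMember(String field) {
        return FIELD_TABLE.containsKey(field);
    }

    public static void main(String[] args) {
        System.out.println(isMember("TITLE"));
        System.out.println(isMember("ABSTRACT"));
    }
}
```

Keeping the indicator as the hashtable value means a later version could start enforcing required fields without changing the array format.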
Additionally, the new attribute set class should define instance variables that indicate which of these fields are "activated" in a particular source-specific instance of the attribute set class. Attribute-set classes are instantiated in the context of a particular source -- each instance of a source carries around an attribute set object that lists all possible fields for the attribute set, as well as a list of those fields that map to data elements in the source. These "source supported" fields should be defined in the "fieldsTranslation" array in the source classes found in the package resource (see #2 below). This array should also contain each field's WAIS equivalent, and a list of languages supported for each field.
When creating a new attribute set class, use the existing classes AttrSetBasic1 and AttrSetDcore1 for reference. The variables and methods implemented in these classes will serve as a guide in creating a new attribute set. Also, note that the attribute-set constructor method receives the "fieldsTranslation" array as an argument from the source:
public AttrSetDcore1(String [][] fieldsTranslation) {
LoadFieldTable();
LoadSourceFields(fieldsTranslation);
LoadFieldXlate(fieldsTranslation);
}
Each source must identify the fields it supports. While an attribute-set class will statically define the fields that are part of that attribute set in general, a source class will statically define only those attribute-set/field combinations that are supported by the source. (See classes CSTRSourceDescription and LINUXSourceDescription.)
When a new attribute set is defined, existing sources should be examined to determine which fields have the same meaning as the new attribute-set fields. To enable the use of a new attribute set for queries against an existing source, the new attribute-set fields must be mapped to the data elements that exist for the source. Each new attribute-set field should be entered into the "fieldsTranslation" array, along with its WAIS translation and a list of languages supported in the following format:
{"<attribute set>", "<field>", "<WAIS translation>", "<languages supported>"}
Example:
protected static String[][] fieldsTranslation = {
{"basic-1", "author", "au", ""},
{"basic-1", "title", "ti", ""},
{"basic-1", "linkage", "id", ""},
{"dcore-1", "CREATOR", "au", ""},
{"dcore-1", "TITLE", "ti", ""},
{"dcore-1", "IDENTIFIER", "id", ""}
};
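To illustrate how such a table might be consulted, the sketch below lists the fields a source declares for one attribute set and translates a field to its WAIS equivalent. It is an assumption about usage, not the reference implementation: the class FieldsTranslationSketch and the methods fieldsFor() and toWais() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

// Hypothetical helper; the real logic lives in the attribute-set classes.
class FieldsTranslationSketch {
    // {"<attribute set>", "<field>", "<WAIS translation>", "<languages supported>"}
    static final String[][] FIELDS_TRANSLATION = {
        {"basic-1", "author", "au", ""},
        {"basic-1", "title", "ti", ""},
        {"basic-1", "linkage", "id", ""},
        {"dcore-1", "CREATOR", "au", ""},
        {"dcore-1", "TITLE", "ti", ""},
        {"dcore-1", "IDENTIFIER", "id", ""}
    };

    // "<attribute set> <field>" -> WAIS field name.
    static final Hashtable<String, String> WAIS_FOR_FIELD = new Hashtable<>();

    static {
        for (String[] row : FIELDS_TRANSLATION) {
            WAIS_FOR_FIELD.put(row[0] + " " + row[1], row[2]);
        }
    }

    // Fields this source supports for the given attribute set, in table order.
    static List<String> fieldsFor(String attrSet) {
        List<String> fields = new ArrayList<>();
        for (String[] row : FIELDS_TRANSLATION) {
            if (row[0].equals(attrSet)) {
                fields.add(row[1]);
            }
        }
        return fields;
    }

    // WAIS equivalent of a field, or null if the source does not support it.
    static String toWais(String attrSet, String field) {
        return WAIS_FOR_FIELD.get(attrSet + " " + field);
    }
}
```

For example, toWais("dcore-1", "CREATOR") and toWais("basic-1", "author") both resolve to the same WAIS field, which is how two attribute sets can share one underlying data element.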
STARTS designates modifiers by attribute set, but the implementation does not yet associate modifiers with the attribute-set class. When you add an attribute set, you must therefore also add its modifiers to the modifier tables. These are the two static arrays "modifiersSupported" and "modifiersTranslation" in class WAISSourceDescription.
"modifiersSupported" contains the attribute-set qualified modifier and "true" or "false" to indicate if the modifier is supported at this field, followed by a list of languages:
{"<attribute set> <modifier>", "<true or false>", "<list of languages>"}
Example:
protected static String[][] modifiersSupported = {
{"basic-1 <", "true", ""},
{"basic-1 =", "true", ""},
{"basic-1 stem", "false", ""},
{"basic-1 phonetic", "true", ""},
{"dcore-1 <", "true", ""},
{"dcore-1 =", "true", ""},
{"dcore-1 stem", "false", ""},
{"dcore-1 phonetic", "true", ""}
};
"modifiersTranslation" should have an entry for each supported modifier (indicated as "true" in the "modifiersSupported" array). "modifiersTranslation" contains the attribute-set qualified modifier and the WAIS equivalent of the modifier:
{"<attribute set> <modifier>", "<WAIS equivalent of modifier>"}
Example:
protected static String[][] modifiersTranslation = {
{"basic-1 <", "<"},
{"basic-1 =", "="},
{"basic-1 phonetic", "SOUNDEX"},
{"dcore-1 <", "<"},
{"dcore-1 =", "="},
{"dcore-1 phonetic", "SOUNDEX"}
};
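Because the two modifier tables are maintained by hand, it is easy to mark a modifier as supported and forget its translation. The following is a small consistency check one could run while editing the tables. It is a sketch only: the class ModifierTablesSketch and the method missingTranslations() are hypothetical and not part of the STARTS code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for checking hand-edited modifier tables.
class ModifierTablesSketch {
    // Returns every modifier marked "true" in the supported table
    // that has no row in the translation table.
    static List<String> missingTranslations(String[][] supported,
                                            String[][] translation) {
        Set<String> translated = new HashSet<>();
        for (String[] row : translation) {
            translated.add(row[0]);
        }
        List<String> missing = new ArrayList<>();
        for (String[] row : supported) {
            boolean isSupported = "true".equals(row[1]);
            if (isSupported && !translated.contains(row[0])) {
                missing.add(row[0]);
            }
        }
        return missing;
    }
}
```

An empty result means every supported modifier has a WAIS equivalent. Modifiers marked "false" are ignored, since they need no translation.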
Each source must identify the modifiers it supports for each field. As with the "fieldsTranslation" table in number 2. above, the modifiers supported for each field are statically defined at the source. (See classes CSTRSourceDescription and LINUXSourceDescription.)
When a new attribute set is defined, existing sources should be examined to determine which modifiers should be associated with which WAIS fields. To enable the use of a new attribute set modifier for queries against an existing source, the new attribute-set qualified modifier must be mapped to the data elements that exist for the source. Each supported attribute-set modifier (see number 3. above) should be entered into the "WAISFieldSupportedForModifier" array, along with the WAIS fields it can modify in the following format:
{"<attribute set> <modifier>", "<WAIS field names>"}
Example:
protected static String[][] WAISFieldSupportedForModifier = {
{"basic-1 <", "dm"},
{"basic-1 =", "au ti dm any id bd"},
{"basic-1 phonetic", "au ti id bd"},
{"dcore-1 <", "dm"},
{"dcore-1 =", "au ti dm any id bd"},
{"dcore-1 phonetic", "au ti id bd"}
};
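As an illustration of how this table could be used, the sketch below tests whether a given attribute-set qualified modifier may be applied to a given WAIS field. It assumes the WAIS field names in the second column are separated by whitespace, as in the example. The class ModifierFieldSketch and the method canModify() are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper; not part of the STARTS code.
class ModifierFieldSketch {
    // {"<attribute set> <modifier>", "<WAIS field names>"}
    static final String[][] WAIS_FIELD_SUPPORTED_FOR_MODIFIER = {
        {"basic-1 <", "dm"},
        {"basic-1 =", "au ti dm any id bd"},
        {"basic-1 phonetic", "au ti id bd"},
        {"dcore-1 <", "dm"},
        {"dcore-1 =", "au ti dm any id bd"},
        {"dcore-1 phonetic", "au ti id bd"}
    };

    // True if the table lists waisField for the qualified modifier.
    // A modifier with no row in the table can modify nothing.
    static boolean canModify(String qualifiedModifier, String waisField) {
        for (String[] row : WAIS_FIELD_SUPPORTED_FOR_MODIFIER) {
            if (row[0].equals(qualifiedModifier)) {
                String[] fields = row[1].trim().split("\\s+");
                return Arrays.asList(fields).contains(waisField);
            }
        }
        return false;
    }
}
```

Splitting the list into whole field names, rather than using a substring test, avoids false matches between a short field name and a longer one that contains it.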
Each source class should contain the static hashtable "attrSetsTable" with entries for each attribute set that the source supports. The key to the hashtable is the attribute-set name, and the value is an attribute-set object. A static initializer populates the "attrSetsTable" hashtable by creating an entry for each attribute set and instantiating a new attribute-set object. The new attribute-set object is instantiated with the source-specific field information in the "fieldsTranslation" array:
static Hashtable attrSetsTable = new Hashtable();
static {
attrSetsTable.put(
"basic-1",
new AttrSetBasic1(fieldsTranslation));
attrSetsTable.put(
"dcore-1",
new AttrSetDcore1(fieldsTranslation));
}
STARTS specifies that date fields must be in ISO 8601 yyyy-mm-dd format. The code handles this crudely: it checks for "date", "Date", or "DATE" in the field name. If your attribute set has a date field whose name does not meet this criterion, you will need to change the code in two places:
- in query/Lstring in the Check() method
- in wais/WAISSourceDescription in the TermToFilter() method
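The kind of check involved can be sketched as follows. This is an assumption about the behavior described, not a copy of Check() or TermToFilter(): the class DateFieldSketch and both method names are hypothetical, and the sketch uses java.time, which the original code may not use.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical illustration of date-field detection and validation.
class DateFieldSketch {
    // Mirrors the described test: only these three spellings are recognized,
    // so a field named "created" or "DaTe" would be missed.
    static boolean looksLikeDateField(String fieldName) {
        return fieldName.contains("date")
            || fieldName.contains("Date")
            || fieldName.contains("DATE");
    }

    // True if value is a valid calendar date written as yyyy-mm-dd.
    static boolean isYyyyMmDd(String value) {
        try {
            LocalDate.parse(value); // default parser expects yyyy-mm-dd
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

A new attribute set whose date field has a different name would need looksLikeDateField(), or its real counterpart in each of the two places listed, extended to recognize that name.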
The StartsServer is structured in a manner that permits easy addition or modification of the WAIS sources. The reference implementation provides access to two sample sources:
Steps for changing the sources are as follows:
// Load the sources hashtable
static {
sources.put("cstr", new CSTRSourceDescription());
sources.put("linux", new LINUXSourceDescription());
}
Modify this code so that each key in the hashtable is the name of one of your sources and each value is an instance of the class that describes that source.
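A self-contained sketch of such a modified table follows. Everything in it is hypothetical: the source names "physics" and "theses", the placeholder class SourceDescriptionStub (standing in for your own source-description classes), and the lookup() helper.

```java
import java.util.Hashtable;

// Placeholder for a real source-description class; hypothetical.
class SourceDescriptionStub {
    final String label;

    SourceDescriptionStub(String label) {
        this.label = label;
    }
}

// Hypothetical illustration of a sources table with renamed keys.
class SourcesTableSketch {
    // Source name -> object describing that source.
    static final Hashtable<String, SourceDescriptionStub> sources =
        new Hashtable<>();

    // Load the sources hashtable with this installation's sources.
    static {
        sources.put("physics", new SourceDescriptionStub("Physics preprints"));
        sources.put("theses", new SourceDescriptionStub("Thesis archive"));
    }

    // Returns the description for a source name, or null if unknown.
    static SourceDescriptionStub lookup(String name) {
        return sources.get(name);
    }
}
```

The keys chosen here are the names a client would use to address each source, so they should be stable once published.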
Using the reference implementation to support another native search engine (not freeWAIS-sf) is, by nature, a more complicated task. However, StartsServer is structured in a manner that allows this to be done by sub-classing rather than by rewriting the source- and engine-independent pieces of the code. All WAIS-specific code is isolated in the package wais. The two core classes in this package are:
You will need to create two such sub-classes for the engine to which you wish to provide access. You can then add new sources for this engine, in a manner similar to that described above.
Send questions to help@ncstrl.org