AttributeMarker is an interface that models the phenomenon
of an object that knows how to look for a particular attribute
of a citation (title, author, volume, etc.).
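As a minimal sketch of the idea, assuming a single lookup method: the interface below is not the real AttributeMarker, whose methods are not listed in this index. The names AttributeMarkerSketch, YearMarkerSketch and locate are hypothetical, and the year pattern is only an illustration of one attribute that such an object might look for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical shape only; the real AttributeMarker methods are not shown in this index.
interface AttributeMarkerSketch {
    // Returns the offset where the attribute starts in the citation, or -1 if absent.
    int locate(String citation);
}

// Illustrative marker that looks for a four-digit publication year (1800-2099).
class YearMarkerSketch implements AttributeMarkerSketch {
    private static final Pattern YEAR = Pattern.compile("\\b(1[89]|20)\\d{2}\\b");

    @Override
    public int locate(String citation) {
        Matcher m = YEAR.matcher(citation);
        return m.find() ? m.start() : -1;
    }
}
```

Each attribute (title, author, volume and so on) would get its own implementation, so the caller can treat them uniformly.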
CitationOutput is the superclass of each class that
writes the citation data to the specified PrintWriter
in the one specific format requested by doTXT, doHTML or doXML.
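The superclass/subclass arrangement described for CitationOutput might look roughly like the sketch below. Everything here is an assumption made for illustration: the class names, the emit method and its two-field signature are hypothetical, and only the use of a PrintWriter and the idea of one subclass per format come from the description.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical base class: holds the writer and leaves the format to subclasses.
abstract class CitationOutputSketch {
    protected final PrintWriter out;

    CitationOutputSketch(PrintWriter out) {
        this.out = out;
    }

    // Each subclass writes one citation in its own format.
    abstract void emit(String author, String title);
}

class TextCitationOutputSketch extends CitationOutputSketch {
    TextCitationOutputSketch(PrintWriter out) { super(out); }

    @Override
    void emit(String author, String title) {
        out.print(author + ". " + title + ".\n");
        out.flush();
    }
}

class XmlCitationOutputSketch extends CitationOutputSketch {
    XmlCitationOutputSketch(PrintWriter out) { super(out); }

    @Override
    void emit(String author, String title) {
        // No entity escaping is done here; a real emitter would need it.
        out.print("<citation><author>" + author + "</author><title>"
                + title + "</title></citation>\n");
        out.flush();
    }
}

class CitationOutputDemo {
    // Renders one citation as plain text into a string.
    static String renderText(String author, String title) {
        StringWriter sw = new StringWriter();
        new TextCitationOutputSketch(new PrintWriter(sw)).emit(author, title);
        return sw.toString();
    }

    // Renders one citation as XML into a string.
    static String renderXml(String author, String title) {
        StringWriter sw = new StringWriter();
        new XmlCitationOutputSketch(new PrintWriter(sw)).emit(author, title);
        return sw.toString();
    }
}
```

The caller picks the subclass once, and the harvesting code only ever sees the base type.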
doit initialises the citation harvesting process by setting up the debugging
stream, storing the document id, creating an entity encoder if necessary
and calling the readLoop to process all the citations.
extended is a debugging relic which controls whether the original
author string is emitted along with the rest of the XML output for
immediacy of comparison.
EXTRA -
Static variable in class uk.ac.soton.harvester.Deciter
EXTRA is the index of the object in the AttributeMarkers array that recognises
the position of any extra features (e.g.
firstNameFirstHint declares that the citation style tends to put
the first name before the surname, at least after the initial author
has been dealt with (surnames always come first for first authors
so that you can see the primary sort key).
isBook is a utility method that encapsulates a naive heuristic
(oh, alright then, hack) for determining whether the
citation was to a book/thesis or not.
ISOLatHashTable provides a hash table which is
already filled in with a mapping between the
ISOLatin-1 entity names and the character
positions by which they are represented.
ISOLatRevHashTable is the inverse of
ISOLatHashTable, and provides a hash table which is
already filled in with an inverse mapping between the
ISOLatin-1 entity names and the character
positions by which they are represented.
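The pairing of a forward and an inverse table can be sketched as follows. This is an illustration, not the harvester's code: the class name IsoLatinTablesSketch and its methods are hypothetical, java.util.HashMap stands in for whatever hash table class the originals use, and only four of the ISO Latin-1 entities are filled in.

```java
import java.util.HashMap;
import java.util.Map;

class IsoLatinTablesSketch {
    // Forward table: entity name -> ISO Latin-1 character position.
    static Map<String, Integer> forward() {
        Map<String, Integer> table = new HashMap<>();
        table.put("nbsp", 160);
        table.put("copy", 169);
        table.put("eacute", 233);
        table.put("uuml", 252);
        return table;
    }

    // Inverse table: character position -> entity name, built from the forward table.
    static Map<Integer, String> reverse() {
        Map<Integer, String> table = new HashMap<>();
        for (Map.Entry<String, Integer> e : forward().entrySet()) {
            table.put(e.getValue(), e.getKey());
        }
        return table;
    }

    // Replaces characters that have a known entity; leaves all others unchanged.
    static String encode(String text) {
        Map<Integer, String> rev = reverse();
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            String name = rev.get((int) c);
            sb.append(name != null ? "&" + name + ";" : String.valueOf(c));
        }
        return sb.toString();
    }
}
```

The forward table serves decoding of entity references in input, while the inverse table is what an entity encoder would consult on output.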
isProceedings is a utility method that encapsulates a
naive heuristic (oh, alright then, hack) for determining
whether the citation was to a conference/workshop proceedings or not.
notAuthor is the first potential author-string token which
seems not to be an author name.
NUMBERING -
Static variable in class uk.ac.soton.harvester.Deciter
NUMBERING is the index of the object in the AttributeMarkers array that recognises
any initial preprocessing before the recognition proper gets underway.
POSTPROCESS is the index of the object in the AttributeMarkers array that performs
any subsequent postprocessing and rationalisation of the marker values.
PREPROCESS is the index of the object in the AttributeMarkers array that performs
any initial preprocessing before the recognition proper gets underway.
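The use of named constants as indices into one array of markers could be sketched like this. The index values, the StageSketch interface and the two stand-in stages are all hypothetical; the real constants in Deciter and the work done by the real markers may well differ.

```java
// Hypothetical stage interface; stands in for AttributeMarker.
interface StageSketch {
    String apply(String citation);
}

class MarkerArraySketch {
    // Index values are illustrative; the real constants in Deciter may differ.
    static final int PREPROCESS = 0;
    static final int POSTPROCESS = 1;

    static final StageSketch[] MARKERS = new StageSketch[2];
    static {
        // Stand-in preprocessing: trim and collapse runs of whitespace.
        MARKERS[PREPROCESS] = s -> s.trim().replaceAll("\\s+", " ");
        // Stand-in postprocessing: make sure the text ends with a full stop.
        MARKERS[POSTPROCESS] = s -> s.endsWith(".") ? s : s + ".";
    }

    static String run(String citation) {
        String cleaned = MARKERS[PREPROCESS].apply(citation);
        // Recognition proper would happen here, between the two stages.
        return MARKERS[POSTPROCESS].apply(cleaned);
    }
}
```

Keeping every marker in one array lets the caller address a stage by name while still being able to loop over all of them.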
split_multiCitation
If significant citation material is found to be left over with a multiCite
hint in operation, it may be assumed that another citation occurrence has
been found and dodecite may be called recursively.
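The recursive step can be sketched as below, under loudly stated assumptions: a semicolon is taken as the boundary between citations, and leftover text counts as significant once it reaches a fixed length. Neither rule comes from the source, and MultiCiteSketch, dodeciteSketch, split and MIN_LEFTOVER are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class MultiCiteSketch {
    // Hypothetical threshold for "significant" leftover material.
    static final int MIN_LEFTOVER = 10;

    // Takes one citation off the front of text and recurses on the
    // remainder when the multiCite hint is on and enough text is left.
    static void dodeciteSketch(String text, boolean multiCiteHint, List<String> found) {
        // Assumption: a semicolon ends one citation; the real test is not described.
        int cut = text.indexOf(';');
        String head = (cut < 0 ? text : text.substring(0, cut)).trim();
        String rest = cut < 0 ? "" : text.substring(cut + 1).trim();
        if (!head.isEmpty()) {
            found.add(head);
        }
        if (multiCiteHint && rest.length() >= MIN_LEFTOVER) {
            dodeciteSketch(rest, true, found);
        }
    }

    // Convenience wrapper that collects every citation found.
    static List<String> split(String text, boolean multiCiteHint) {
        List<String> found = new ArrayList<>();
        dodeciteSketch(text, multiCiteHint, found);
        return found;
    }
}
```

Without the hint, or when only a short fragment remains, the leftover text is ignored rather than treated as a further citation.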